Stopped this week to think again about versions of the text available to us. What would be the shortest path for someone who wanted to reproduce this work later? Not such an easy problem. Read on for more.
I ended last week with a list of views in a new app that needed to be coded. I started cleaning up the planner code to run in the new system and started looking at the file formats for the text under test.
I really did not like the formats I've been using for years, so the whole pipeline from download to the solver was open to rethinking. The solver wants each book to be a flat list of words, without explicit verse breaks, so that it can run as a sliding window against that list.
Verse markers are simply tags on specific words in that list, to help the user know where they are. This is a format style I've never seen done with Bible texts.
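As a rough sketch of that idea, not the actual solver code: the names `Word`, `flatten`, `windows` and `locate` below are made up for illustration, and the input is assumed to be a list of (verse reference, verse text) pairs with space-separated words.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class Word:
    text: str
    # Set only on the first word of a verse, e.g. "1:1"; None elsewhere.
    verse: Optional[str] = None


def flatten(verses: List[Tuple[str, str]]) -> List[Word]:
    """Turn (reference, text) pairs into one flat list of tagged words."""
    words: List[Word] = []
    for ref, line in verses:
        for i, token in enumerate(line.split()):
            words.append(Word(token, ref if i == 0 else None))
    return words


def windows(words: List[Word], size: int) -> Iterator[List[Word]]:
    """Slide a fixed-size window over the word list, ignoring verse breaks."""
    for start in range(len(words) - size + 1):
        yield words[start:start + size]


def locate(words: List[Word], index: int) -> Optional[str]:
    """Walk backwards from a word to the nearest verse tag."""
    for i in range(index, -1, -1):
        if words[i].verse is not None:
            return words[i].verse
    return None
```

The window never has to know about verses, and `locate` is only called when the user needs to be told where a given word sits.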
File format conversions begin right after download, so I stepped back and looked at the texts that are the starting point for this work. It has been 2 years since I thought much about this, and I went back to first principles and rechecked some of the original online sources.
For several years we have thought about the problem along the lines of building a 'critical edition' which would combine the versions available to us.
This is an odd starting point from the perspective of repeatability. Most scientific work is not considered done until all the experiments have been reproduced.
I would hope that someone somewhere would take an interest in copying the work once we have shown how the text's audit pattern actually works.
So the starting documents, the text we use, become important. Maybe we don't want some sort of custom text. Maybe we want to target some obvious readily available version and start there?
As we understand Matthew 13, the problem will be solved by someone because Joshua has left the problem to be solved. So the solution must start with docs that Joshua has expressly protected across history. So there must be a starting point document set.
OK, so I set out to think again about primary sources. The best I am aware of for the OT Peshitta is on GitHub. Here is the link for those who want to browse around.
This is run by a group associated with a university in Amsterdam. This is their online repository. If you clone it you have the best quality formatting for the OT Peshitta files that I am aware of.
This is still for computer people. It is not an app or anything similar. But it is what a computer person would want. This would be my recommended starting point for the OT part of the problem.
This week I redid the clone, because of a new 2021 version, and then wrote the code to do the file conversions needed from that repository to the new solver. This was not very hard, a few hours of work and then some debugging.
But, there are still a bunch of problems. This is not a group particularly focused on the Peshitta. They are mostly focused on the Hebrew OT. So they are not doting over having a good quality copy of the Peshitta text letter-for-letter. The biggest problem when I went to check their work against other versions was finding verses out of order.
This is especially so in Lamentations, which is an acrostic, so the text itself screams of being out of order. So this is an easy-to-get source, but one that needs careful examination against other texts before use.
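One cheap first check for this kind of problem is to walk the verse references in file order and flag any that fail to increase. A minimal sketch, assuming the references have already been parsed into (chapter, verse) number pairs; the function name is hypothetical. It only checks the numbering, so text that is shuffled under correct numbers still needs a comparison against another version.

```python
from typing import List, Tuple

Ref = Tuple[int, int]  # (chapter, verse), an assumed representation


def out_of_order(refs: List[Ref]) -> List[Tuple[Ref, Ref]]:
    """Return each (previous, current) pair where the reference does not increase."""
    problems = []
    for prev, cur in zip(refs, refs[1:]):
        # Tuples compare chapter first, then verse; a repeat is also flagged.
        if cur <= prev:
            problems.append((prev, cur))
    return problems
```

For example, `out_of_order([(1, 1), (1, 2), (1, 4), (1, 3), (2, 1)])` flags the step from `(1, 4)` to `(1, 3)` and nothing else.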
For the NT, all roads seem to head back to the 1905 British and Foreign Bible Society critical edition of the NT Peshitta. In this case many other downloads, including the Sword module, cite their source as tertullian.org. The interesting page is not easily reached from the front of the site. Here is the key link.
These are HTML pages, and it takes some coding wizardry to download and reformat them. Not hard for an adept programmer. Probably mind-boggling for a beginner.
In any case, this seems to be the top of the food chain for all other easily downloaded copies of the NT Peshitta. Some, like the Sword modules, have verses out of order and encoding problems that will need cleanup for anyone using those.
Both of these sources are expressly public domain, which is also very good. So a student can use these sources without the need for a grant.
These 2 starting points need to be checked against other versions. The NT not so much, but the OT especially. There is an alternative OT source, also run by a university. Also not reachable from the front page of their website. Also expressly public domain for the Peshitta itself. Here is the link for that source.
The problem here is they are missing Daniel and the 5 books from the NT not considered canonical. There have been no updates in 2 years, so beware. It is tied into an online database for other purposes, which might be helpful.
Obviously at this point our starting text requires considerable work just to establish it is trustworthy. We have a paid version of the Peshitta that we purchased years ago when we started thinking about Aramaic. It does have parallel English, but otherwise is even lower quality than the links I've given above.
By comparing various versions to each other, the files themselves can be checked for sanity. Which is what anyone wanting to reproduce the work will need to do. There does not appear to be any way around it.
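A minimal sketch of what such a cross-check could look like, assuming each source has already been converted into a dictionary from (chapter, verse) to verse text. The function name and the report layout are made up for illustration, and real files would likely need spelling and encoding normalized before a comparison like this means much.

```python
from typing import Dict, List, Tuple

Ref = Tuple[int, int]  # (chapter, verse), an assumed representation


def compare_sources(a: Dict[Ref, str], b: Dict[Ref, str]) -> Dict[str, List[Ref]]:
    """Report verses missing from either source and verses whose words differ."""
    report: Dict[str, List[Ref]] = {
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
        "differ": [],
    }
    for ref in sorted(set(a) & set(b)):
        # Compare word by word so spacing differences are ignored.
        if a[ref].split() != b[ref].split():
            report["differ"].append(ref)
    return report
```

Every verse that lands in the report is a place where at least one of the files needs a human to look at it.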
If anyone knows of any other free, public-domain sources for either the OT or NT Peshitta, please send me a link.