The Book of Matthew
Overview
The Book of Matthew in the UPDV is a reconstructed text. The Greek Matthew that survives in all known manuscripts appears to have undergone significant modification before the earliest copies were made — modifications that are invisible to standard textual criticism because they predate the manuscript tradition itself.
The UPDV reconstruction removes material identified as later editorial additions, restores chronological order where the text was rearranged, draws on parallel accounts in Mark and Luke where Matthew's text is unreliable, and renumbers the resulting chapters and verses. The reconstruction retains approximately 70% of the canonical Matthew text.
This article explains the evidence behind the reconstruction in three layers of depth: a plain summary, the historical and textual evidence, and the computational validation performed in 2026.
The Problem in Plain Terms
Matthew's Gospel was originally composed in Hebrew or Aramaic. At some point it was translated into Greek, and during or after that translation, an editor made substantial changes. He added birth stories at the beginning, inserted prophecy citations throughout, rearranged teaching material into long thematic speeches, amplified miracles and punishments, and added narrative details drawn from Greco-Roman literary conventions rather than eyewitness tradition.
These changes were made so early that every surviving Greek manuscript contains them. There is no "unedited" copy of Matthew to compare against. But the modifications left traces:
- Other Gospels don't have them. Mark and Luke record the same events without the additions. When Matthew has extra material that Mark and Luke lack, that material consistently shows patterns of literary embellishment, theological agenda, or stylistic features foreign to the rest of the book.
- Early witnesses knew a different text. Church fathers writing in the second through fourth centuries reference Hebrew versions of Matthew that lacked the birth narrative. Some communities preserved traditions of a substantially shorter gospel.
- The text contradicts itself. The editor's insertions sometimes create logical tensions with surrounding material — prophecies that don't quite fit their context, doublings of characters that the parallel accounts don't support, and sayings pulled from their original settings into artificial speech collections.
The UPDV reconstruction identifies these editorial additions using multiple independent lines of evidence and removes them to recover a text closer to what the original author wrote. Where removal creates gaps, parallel accounts from Mark and Luke fill them. The resulting text covers the same events and teachings but without the later editorial overlay.
This is not a compilation from Mark and Luke — it is Matthew's own Gospel, recovered. Seventy percent of the canonical text survives, including the narrative framework, the teaching material, and the genealogy. The reconstruction preserves what this author uniquely contributed while removing what a later editor imposed. For readers who want to see exactly what changed, the UPDV provides 28 chapter-by-chapter text comparisons showing the old and new text side by side, a source chart identifying which Gospel each verse was drawn from, and a verse renumbering chart mapping old verse numbers to new.
The Evidence
The Hebrew Original
By most patristic accounts, Matthew was originally written in Hebrew or Aramaic. Although no copies descending directly from the Hebrew version survive, several indirect witnesses preserve traces of it. Church fathers reference a gospel written in Hebrew, noting where it differs from the Greek. They mention different Hebrew versions, different communities using them, and various concerns about these texts and groups. Together these references indicate the shape of the original.
Witnesses to a Substantially Different Text
The missing opening chapters. Epiphanius of Salamis (Panarion) knew of Hebrew versions that did not contain the first two chapters. He also indicates that some versions retained the genealogy but not the birth narrative, suggesting the infancy material was a later addition to an existing text.1
The Jewish Christian tradition. In 1966, Shlomo Pines described a text reflecting the views and traditions of a Jewish Christian community. Though transmitted through later centuries, this tradition appears to reach back to the earliest period of Christianity. These texts imply that the "true" Hebrew Gospel did not contain an account of the birth and life of Jesus as found in the canonical Matthew.2
Matthew 1:16 and the virgin birth. Multiple early witnesses — Manuscripts R and O of The Dialogue of Timothy and Aquila, the Old Syriac (Sinaiticus), Palestinian Syriac witnesses, and Von Soden's critical text — show that Matthew 1:16 was modified very early to support the concept of a virgin birth.3
Dream narratives and Greco-Roman parallels. The dream stories in Matthew's infancy narrative share formal features with Greco-Roman literature, particularly the ancient novels. Chariton 2.9.6 offers a close parallel to Matthew 1:18b-24: just as Callirhoe resolves a crisis about her unborn child through a dream, Joseph resolves his crisis about Mary's child through a dream. The dream in Matthew 27:19 (Pilate's wife) uses the same terminology, linking the beginning and end of the editorial layer.4
The Synoptic Evidence
The strongest evidence comes from comparing Matthew with Mark and Luke. Scholars have long recognized that Matthew drew on two primary sources: the Gospel of Mark (providing the narrative framework) and a sayings collection shared with Luke (known as Q, from German Quelle, "source"). Material found only in Matthew — called Sondergut or "M" — requires special scrutiny.
When the three Gospels are placed side by side:
- Where Matthew follows Mark, the text is generally reliable. Matthew preserves Mark's narrative order, often with minor stylistic adjustments. The UPDV retains 88% of this material.
- Where Matthew shares sayings with Luke (Q material), the text is usually authentic but sometimes displaced. Matthew collected sayings from different occasions into large discourse blocks (the Sermon on the Mount, the Mission Discourse, the Parables Discourse, etc.), while Luke generally preserved their original contexts. The UPDV retains 76% of this material, sometimes restoring the Lukan order.
- Where Matthew stands alone (Sondergut), the text is most suspect. Only 22% of Matthew's unique material is retained in the UPDV. The rest consists of editorial additions: the infancy narrative, prophecy fulfillment formulas, eschatological amplifications, and narrative embellishments.
Patterns of Modification
When Matthew's unparalleled material is examined systematically, consistent patterns emerge:
- Prophecy fulfillment formulas. Fourteen times, the editor inserted a citation introduced by a phrase like "this was to fulfill what was spoken by the prophet." These formulas share a distinctive literary style that links them to the infancy narrative — the same editorial hand wrote both.5
- Sensationalization. Where Mark reports one demoniac, Matthew has two (8:28). Where Mark describes a straightforward healing, Matthew adds dramatic elements. The pattern is consistent: unparalleled details in Matthew tend to amplify rather than preserve.
- Thematic rearrangement. The editor gathered sayings from different occasions into artificial discourse blocks. Matthew chapter 13 collects parables spoken at different times into a single "Parables Discourse." Chapter 23 collects criticisms of the Pharisees into a single denunciation. Chapters 24-25 combine eschatological material from multiple settings. This rearrangement changes the meaning of individual sayings by removing them from their original contexts.
- Dream plot devices. Six times in Matthew, a dream redirects the story (1:20, 2:12, 2:13, 2:19, 2:22, 27:19). Every occurrence is in unparalleled material. This is a narrative technique drawn from Greco-Roman literary convention, not from the Hebrew tradition.
- Eschatological intensification. The phrase "weeping and gnashing of teeth" appears six times in Matthew (8:12, 13:42, 13:50, 22:13, 24:51, 25:30). In five of these the phrase has no counterpart in Mark or Luke; its only occurrence elsewhere in the Gospels is Luke 13:28, which parallels Matthew 8:12. The formula "the end of the age" (συντέλεια τοῦ αἰῶνος, synteleia tou aiōnos) appears five times, never in the other Gospels. These are editorial stamps: a consistent theological emphasis added throughout the book.
What Was Removed
The reconstruction removes approximately 30% of the canonical text. The major categories:
| Category | Examples |
|---|---|
| Infancy narrative | Birth story, Magi, flight to Egypt, massacre of innocents (1:18–2:23) |
| Fulfillment formulas | "This was to fulfill what was spoken by the prophet..." (14 instances) |
| Discourse transitions | "When Jesus had finished these sayings..." (5 instances) |
| Eschatological additions | Weeping and gnashing codas, end-of-age formulas (e.g., 13:39, 28:20) |
| Narrative embellishments | Pilate's wife's dream, blood curse, resurrected saints, guard at the tomb |
| Unattested teachings | Material with no parallel and showing editorial characteristics |
What About the Parables?
The most debated category is the parables found only in Matthew: the Workers in the Vineyard, the Ten Virgins, the Sheep and Goats, the Unmerciful Servant, the Two Sons, the Hidden Treasure, the Pearl, and the Dragnet. Davies and Allison identify these as a probable pre-Matthean parable collection — genuine traditions that Matthew's editor inherited, not editorial inventions.6
Computational analysis supports this: the parables cluster stylistically with the authentic core of the Gospel, not with the editorial layer. Their language is Semitic in character, they lack the editorial vocabulary fingerprints found in the fulfillment formulas and infancy narrative, and several use βασιλεία τοῦ θεοῦ (basileia tou theou, "kingdom of God") rather than Matthew's characteristic βασιλεία τῶν οὐρανῶν (basileia tōn ouranōn, "kingdom of heaven") — suggesting they predate the editor's theological vocabulary.
However, knowing that a parable is authentic does not tell us where it originally belonged. Matthew's editor arranged these parables thematically, not chronologically. Restoring them to the text would require choosing a placement, and every placement changes the meaning of the parable and its surrounding context. The UPDV has chosen not to include material whose original context cannot be determined, even when the material itself is likely genuine. This remains the one area where the reconstruction may be more conservative than the evidence strictly requires.
The Reconstruction Method
The UPDV reconstruction uses existing Matthew as the primary source wherever possible. Where Matthew's text appears unreliable, parallel accounts from Mark and Luke are drawn upon. In some cases, readings from multiple Gospels are combined. Passages lacking any confirming witness — whether to the reading itself or its context — are not included. Some texts that appear genuine are included even where their original location is uncertain; footnotes indicate this. Slight narrative adjustment was occasionally required for transitions.
Renumbering
The chapter and verse numbering in Matthew has been changed. When it is necessary to refer to both numbering systems, the old system is indicated by "(old)" or "(old numbering)" and the new system by "(new)" or "(new numbering)." Where neither is specified, the new system should be assumed.
Full reference materials — including the verse renumbering chart, source chart, and chapter-by-chapter text comparisons — are available in the Matthew Reconstruction Reference.
Computational Validation (2026)
In 2026, a comprehensive computational analysis was performed on the Greek text of Matthew to test whether the 2005 reconstruction could be independently validated using quantitative methods. The analysis used the PROIEL Treebank (a linguistically annotated corpus of ancient Greek texts in Universal Dependencies format) as its data source, providing word-level lemmatization, part-of-speech tagging, and morphological features for every token in the Gospel.
Change Point Detection
A sliding-window analysis (600 tokens, step 100) was applied across the full text of Matthew, measuring shifts in function word frequencies and part-of-speech patterns. The algorithm — a kernel-based change point detector operating on PCA-reduced feature space — identified eight statistically significant breaks in the text. All five of Matthew's major discourse blocks were detected blindly, without any prior information about the Gospel's structure. The strongest break corresponded to the end of the Sermon on the Mount.
A micro-windowing analysis of chapters 1-5 (250-token window) revealed that chapters 1-2 are completely isolated in the feature space from chapters 3-4, confirming that the infancy narrative is stylistically distinct from the rest of the Gospel.
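The windowing step described above can be sketched in a few lines. This is a minimal illustration, not the analysis pipeline itself: it swaps the kernel-based detector and PCA reduction for a plain Euclidean distance between non-overlapping windows, and the `FUNCTION_WORDS` list is a hypothetical stand-in, since the article does not say which function words were measured.

```python
from collections import Counter
from math import sqrt

# Hypothetical function-word list (assumption: not taken from the article).
FUNCTION_WORDS = ["καί", "δέ", "ὁ", "ἐν", "γάρ"]

def sliding_windows(tokens, size=600, step=100):
    """Yield (start_index, window) pairs across the token sequence."""
    for start in range(0, max(len(tokens) - size, 0) + 1, step):
        yield start, tokens[start:start + size]

def profile(window, vocab=FUNCTION_WORDS):
    """Relative frequency of each function word within one window."""
    counts = Counter(window)
    total = len(window) or 1
    return [counts[w] / total for w in vocab]

def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def shift_scores(tokens, size=600, step=100, lag=None):
    """Compare each window with the first window that no longer overlaps it.

    Returns (boundary_token_index, distance) pairs; peaks mark candidate
    style breaks. A simplification of kernel change point detection.
    """
    lag = lag or max(1, size // step)
    wins = list(sliding_windows(tokens, size, step))
    profs = [profile(w) for _, w in wins]
    return [(wins[i + lag][0], euclidean(profs[i], profs[i + lag]))
            for i in range(len(profs) - lag)]
```

Called on a lemmatized token list, `max(shift_scores(tokens), key=lambda p: p[1])` returns the token offset where the two adjacent 600-token windows differ most. A real run would add part-of-speech features and a significance test, both of which this sketch omits.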
Translation Greek Classification
A classifier trained to distinguish Translation Greek (Greek translated from a Semitic original, using the Septuagint as training data) from native Koine composition (using Josephus and the Greek novels) was applied to Matthew chapter by chapter.
Result: All of Matthew classifies as Translation Greek. Chapters 1-2 scored 70-77% Translation — Semitic in syntax, not Greco-Roman. This refutes the hypothesis that the infancy narrative was composed in a classical Greek literary style (the "Pindaric" hypothesis referenced in earlier editions of this article). The correct diagnosis is that the infancy narrative imitates Septuagint style — it is written by someone thinking in Semitic syntax and deliberately echoing the Greek Old Testament. The editorial conclusion (later addition) stands, but the linguistic mechanism is Septuagintal pastiche, not secular literary composition.7
Synoptic Layer Separation
When Matthew's text is divided by source — Mark-parallel material, Q material (shared with Luke), and Sondergut (unique to Matthew) — and each layer's stylistic profile is measured independently:
- Mark-parallel material clusters with the full text of Mark (PCA distance 0.93)
- Q material clusters with Luke's Q sections (distance 0.98)
- Sondergut floats alone, matching neither
This confirms that Matthew was assembled from identifiable source layers whose syntactic fingerprints survive the compilation process. The compiler smoothed the vocabulary but failed to overwrite the underlying syntax of his sources.
The Editorial Fingerprint
The most significant finding: the 14 fulfillment formulas scattered throughout the Gospel share a micro-syntactic fingerprint with the infancy narrative (chapters 1-2). The PCA distance between the fulfillment formulas and the infancy narrative is reported as 0.93, closer than any Gospel Core layer is to any other Gospel Core layer (internal distances 0.45-0.61). For that comparison to hold, higher values on this scale must indicate a closer match. Bootstrap validation (2,000 iterations) confirmed this clustering at p < 0.001 with non-overlapping confidence intervals.
Conclusion: The fulfillment formulas and the infancy narrative were written by the same editorial hand. This hand is stylistically distinct from the compiler of the Gospel Core.5
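A bootstrap check of this kind can be sketched as follows. The function names, the percentile method, and the use of centroid-to-centroid Euclidean distance (where smaller means closer) are all assumptions made for illustration; the article does not describe how its intervals were computed. Each group is a list of per-passage feature vectors.

```python
import random
from math import sqrt

def centroid(rows):
    """Column-wise mean of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def distance(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def bootstrap_distance_ci(group_a, group_b, iterations=2000, alpha=0.05, seed=0):
    """Percentile confidence interval for the centroid distance between two
    groups, resampling the rows of each group with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    samples = []
    for _ in range(iterations):
        resampled_a = rng.choices(group_a, k=len(group_a))
        resampled_b = rng.choices(group_b, k=len(group_b))
        samples.append(distance(centroid(resampled_a), centroid(resampled_b)))
    samples.sort()
    lower = samples[int((alpha / 2) * iterations)]
    upper = samples[int((1 - alpha / 2) * iterations) - 1]
    return lower, upper

def intervals_overlap(a, b):
    """True when two (lower, upper) intervals share any point."""
    return a[0] <= b[1] and b[0] <= a[1]
```

To test a claim like the one above, one would compute an interval for the formula-to-infancy pair and one for each pair of core layers, then confirm with `intervals_overlap` that the first is separated from the rest.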
Seven-Axis Evaluation Framework
Every verse in Matthew (1,068 total) was scored across seven independent axes of evidence:
| Axis | What It Measures |
|---|---|
| External Attestation | Is this verse paralleled in Mark and/or Luke? |
| Synoptic Divergence | How much does Matthew's version differ from its parallels? |
| Fulfillment Formula | Does this verse contain a prophecy citation formula? |
| Contextual Displacement | Is this verse out of order relative to the parallels? |
| Computational Layer | What does the NLP layer analysis say? |
| Semitic Substrate | Does this verse show Hebrew/Aramaic syntactic patterns? |
| Redaction Profiler | Does this verse contain known editorial vocabulary fingerprints? |
Each axis scores from 0.0 (strongest evidence for authenticity) to 1.0 (strongest evidence for editorial origin). The weighted composite produces tier assignments:
| Tier | Meaning | Verses | % |
|---|---|---|---|
| A | Highest confidence authentic | 496 | 46.4% |
| B | High confidence authentic | 430 | 40.3% |
| C | Mixed signals | 93 | 8.7% |
| D | Multiple editorial indicators | 42 | 3.9% |
| E | Strongest editorial indicators | 7 | 0.7% |
The framework agrees with the 2005 reconstruction on 73% of verses. Where it disagrees, the disagreements fall into predictable categories: the framework identifies some omitted parables as Tier A/B (authentic but lacking external attestation), and some retained editorial transition verses as Tier D/E.
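The scoring scheme can be illustrated with a small sketch. The axis names follow the table above, but the equal weights and the tier cutoffs are hypothetical placeholders: the article publishes neither, so the values here only show the mechanics of a weighted composite.

```python
AXES = [
    "external_attestation", "synoptic_divergence", "fulfillment_formula",
    "contextual_displacement", "computational_layer", "semitic_substrate",
    "redaction_profiler",
]

# Assumption: equal weights and evenly spaced cutoffs, chosen for illustration.
WEIGHTS = {axis: 1.0 for axis in AXES}
TIER_CUTOFFS = [(0.2, "A"), (0.4, "B"), (0.6, "C"), (0.8, "D")]

def composite(scores, weights=WEIGHTS):
    """Weighted mean of the seven axis scores, each in [0.0, 1.0]."""
    for axis in AXES:
        if not 0.0 <= scores[axis] <= 1.0:
            raise ValueError(f"{axis} score out of range: {scores[axis]}")
    total_weight = sum(weights[a] for a in AXES)
    return sum(weights[a] * scores[a] for a in AXES) / total_weight

def tier(score, cutoffs=TIER_CUTOFFS):
    """Map a composite score to a tier; lower scores mean stronger
    evidence for authenticity, so they land in earlier tiers."""
    for upper, label in cutoffs:
        if score < upper:
            return label
    return "E"
```

A verse scoring 0.0 on every axis lands in Tier A and one scoring 1.0 everywhere lands in Tier E. Changing the weights or cutoffs would shift verses between tiers, so the published counts depend on choices this sketch does not reproduce.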
The Redaction Profiler
The profiler is a deterministic rule engine that scans each verse for known editorial patterns — not just vocabulary, but behavioral tells (recurring narrative devices that function as editorial shortcuts). Key detections:
- Dream plot device (ὄναρ, onar): 6 occurrences, all in editorial material. 100% hit rate. The editor uses dreams the way a screenwriter uses flashbacks — to redirect the story without motivation.
- Fulfillment formula: πληρόω (plēroō, "fulfill") + ῥηθέν (rhēthen, "spoken") or προφήτης (prophētēs, "prophet"). The signature editorial insertion.
- Weeping and gnashing: κλαυθμός (klauthmos) + βρυγμός (brygmos). Appears 6 times in Matthew; the phrase has a counterpart elsewhere in the Gospels only once (Luke 13:28, parallel to Matthew 8:12). An editorial punishment formula.
- Kingdom of Heaven substitution: The editor systematically replaced βασιλεία τοῦ θεοῦ ("kingdom of God," used by Mark and Luke) with βασιλεία τῶν οὐρανῶν ("kingdom of heaven"). The few places where "kingdom of God" survives in Matthew (e.g., 21:31) are evidence of pre-editorial tradition showing through.
Of 1,068 verses, 834 (78%) show no editorial fingerprints. The remaining 234 (22%) are flagged with specific explanations of which patterns triggered.
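A deterministic rule engine of the kind described can be sketched as below. The rule table encodes only the three detections whose trigger words are spelled out above; the `(form, lemma)` input shape and the accent-folding step are assumptions added so that, for example, ῥηθὲν in running text matches the rule's ῥηθέν.

```python
import unicodedata

def fold(text):
    """Lowercase and strip accents and breathings so variant spellings match."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Each rule lists alternative word sets; a rule fires when every word in at
# least one alternative is present in the verse (as a form or as a lemma).
RULES = {
    "dream_plot_device": [{"ὄναρ"}],
    "fulfillment_formula": [{"πληρόω", "ῥηθέν"}, {"πληρόω", "προφήτης"}],
    "weeping_and_gnashing": [{"κλαυθμός", "βρυγμός"}],
}
FOLDED_RULES = {
    name: [{fold(word) for word in alt} for alt in alternatives]
    for name, alternatives in RULES.items()
}

def profile_verse(tokens):
    """tokens: iterable of (form, lemma) pairs for one verse.
    Returns the sorted names of every rule that fired."""
    present = set()
    for form, lemma in tokens:
        present.add(fold(form))
        present.add(fold(lemma))
    return sorted(
        name for name, alternatives in FOLDED_RULES.items()
        if any(alt <= present for alt in alternatives)
    )
```

Given the opening of the formula in Matthew 1:22, `profile_verse([("πληρωθῇ", "πληρόω"), ("τὸ", "ὁ"), ("ῥηθὲν", "λέγω")])` returns `["fulfillment_formula"]`. The kingdom-of-heaven detection is left out because it depends on phrase order, which a bag-of-words check like this one cannot express.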
Scholarly Confirmation
The reconstruction's approach corresponds closely to the critical consensus as expressed in the standard scholarly commentary on Matthew:
"Two facts are immediately apparent. First, a great portion of the material found in neither Mark nor Luke is redactional or partly redactional." — Davies and Allison6
Davies and Allison identify the same source layers the computational analysis detected: Mark as narrative backbone, Q as teaching source, and M (Sondergut) as a mixture of editorial composition and a small collection of pre-existing traditions (primarily parables). They reject the hypothesis of a unified "M document," concluding that Matthew's unique material comes from multiple sources — some genuine, most editorial.
On the displacement of Q material:
"Despite the different arrangements (which we attribute almost exclusively to the Matthean redaction), the material in Luke and Matthew reflects a common order." — Davies and Allison6
This confirms that the chronological rearrangement in the UPDV — particularly the restoration of sayings to contexts closer to Luke's order — has scholarly support.
Summary of Computational Findings
| Finding | Method | Significance |
|---|---|---|
| Chapters 1-2 stylistically isolated | Change point detection | Confirms infancy narrative is a distinct layer |
| All of Matthew is Translation Greek | Supervised classifier | Refutes Pindaric hypothesis; supports Hebrew original |
| Source layers detectable in syntax | PCA clustering | Confirms Two-Source Hypothesis computationally |
| Fulfillment formulas = same hand as infancy | Bootstrap validation | p < 0.001; non-overlapping confidence intervals |
| M parables cluster with Gospel Core | Stylometric clustering | Pre-Matthean tradition, not editorial invention |
| 78% of verses have no editorial fingerprints | Redaction Profiler | Most of Matthew is transmitted tradition, not editorial |
| Framework agrees with 2005 reconstruction 73% | 7-axis scoring | Independent validation of ad-hoc methodology |
The computational analysis, performed two decades after the original reconstruction, independently validates the macro-level editorial decisions while providing quantitative precision the original methodology could not. The one area where the computational evidence challenges the reconstruction — the M parables — remains unresolved due to the displacement problem: authentic material whose original context cannot be recovered.
Notes
1. Epiphanius of Salamis, Panarion (Adversus Haereses). On the Hebrew Gospel used by the Ebionites and Nazarenes.
2. Pines, Shlomo. The Jewish Christians of the Early Centuries of Christianity According to a New Source. Jerusalem: Israel Academy of Sciences and Humanities, 1966. Pages 21, 23.
3. The Dialogue of Timothy and Aquila (TA), Manuscripts R and O at 17.3ab. Also: Old Syriac (Sinaiticus), Palestinian Syriac witnesses, and Von Soden's critical text.
4. Dodson, Derek S. "Dreams, the Ancient Novels, and the Gospel of Matthew: An Intertextual Study." Perspectives in Religious Studies 29 (Spring 2002): 46-47, 51.
5. Computational stylometry analysis (2026). PROIEL Treebank data, bootstrap validation (2,000 iterations).
6. Davies, W. D. and Dale C. Allison Jr. A Critical and Exegetical Commentary on the Gospel according to Saint Matthew. ICC. 3 vols. Edinburgh: T&T Clark, 1988–1997. Vol. 1, pp. 121–127.
7. The earlier characterization of the infancy narrative's style as matching "the scholia on Pindar" (Abel, Scholia recentia in Pindari epinicia, 1891) identified a real stylistic anomaly but misdiagnosed the mechanism. Computational classification shows the syntax is Semitic (Translation Greek), not classical. The anomaly is Septuagintal imitation: the editor wrote in a deliberately archaizing biblical Greek, not secular literary Greek.