David A. Tanzer, January 2021
How complex is the English language? That’s a tough nut to crack. Here we’ll reframe the question in a more formal setting which is shared by computer science and linguistics: the theory of formal languages and their complexity. By the end, we’ll reach a technical understanding of what is meant by the famous conjecture P != NP.
To talk about the unit in general, and … Read more
David A. Tanzer, January 2021. Cross-posted to Azimuth.
A simple view of languages
How complex is the English language? The question has many dimensions and nuances. What does complexity mean, and how could it be measured? This a tough nut to crack, so in this post we’ll make a retrenchment and reconsider the question in a more formal setting — computer science.
In this setting, a language gets identified … Read more
David A. Tanzer, January 2021. Cross-posted to Azimuth.
Decision problems, decision procedures and complexity
In Part 1, we introduced the formal idea of a language, as being just a set of sentences. Now let’s approach the topic of language complexity.
For any given language, there is an associated decision problem: given a candidate string of characters, determine whether or not it belongs to the language. A decision … Read more
David A. Tanzer, January 2021. Cross-posted to Azimuth.
A detailed measure of computational complexity
In Part 2, we identified language complexity by how “difficult” it is for a program to decide if some given text is a sentence of the language — in other words, by the complexity of the decision problem for the language.
Note there may be many decision procedures — which are implementations of functions … Read more
David A. Tanzer, January 2021. Cross-posted to Azimuth.
Summarizing computational complexity
In Part 3 we defined, for each program P, a detailed function P'(n) that gives the worst case number of steps that P must perform when given some input of size n. Now we want to summarize P into general classes, such as linear, quadratic, etc.
What’s in a step?
But we should clarify something before proceeding: what … Read more
David A. Tanzer, January 2021. Cross-posted to Azimuth.
Big O analysis
Recall our function $f(n)$ from Part 4, which gives the values 2, 13, 14, 25, 26, 37, …
Using ‘big O’ notation, we write $f(n) = O(n)$ to say that $f$ is linearly bounded.
This means that $f(n)$ will eventually become less than some linear function.
As we said, $f(n)$ has a “rough slope” of 6. So … Read more
David A. Tanzer, January 2021. Cross-posted to Azimuth.
Quadratic complexity
In Part 5 we introduced big O notation for describing linear complexity. Now let’s look at a function with greater than linear complexity.
def square_length(text):
# compute the square of the length of text
# FIXME: not the most elegant or efficient approach
n = length(text)
counter = 0
for i = 1 to n:
for j = 1
…
Read more
David A. Tanzer, February 11, 2021. Cross-posted to Azimuth.
Higher complexity classes
In Part 6, we saw a program with quadratic complexity. The collection of all languages that can be decided in $O(n^k)$ time for some $k$ is called P, for polynomial time complexity.
Now let’s consider languages that appear to require time that is exponential in the size of the input, and hence lie outside of P.
Here … Read more