Language Complexity, Part 3


David A. Tanzer, January 2021. Cross-posted to Azimuth.

A detailed measure of computational complexity

In Part 2, we identified the complexity of a language with how “difficult” it is for a program to decide whether a given text is a sentence of the language – in other words, with the complexity of the decision problem for the language.

Note there may be many decision procedures – that is, implementations of functions – that solve the decision problem for a language. For our purposes, we will focus our attention just on the ones that give the best performance. As motivation for this focus, consider that our simple language ALL_A = {“A”, “AA”, “AAA”, …} could be decided by an ill-conceived program that does all sorts of unnecessary work, but that would say nothing about the complexity of ALL_A itself. On the other hand, the procedure that simply scans the characters and verifies that all are ‘A’ is optimal, and simple – which shows that ALL_A is itself a simple language.
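To make this concrete, here is a minimal sketch in Python of such a scanning decider (the function name and loop structure are illustrative assumptions, not the exact procedure from Part 2):

```python
def decide_all_a(text):
    """Decide membership in ALL_A: the strings "A", "AA", "AAA", and so on."""
    if len(text) == 0:
        return False          # the empty string is not in ALL_A
    for ch in text:
        if ch != 'A':
            return False      # reject as soon as a non-'A' character appears
    return True               # every character was 'A'
```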

Our goal now is to quantify the notion of language complexity. That boils down to the matter of how to quantify the amount of work a program — in this case, the language decider — needs to do.

Computational complexity

Say we have a program P that takes a string as input, goes through some sequence of computation steps, and outputs a result. For each input string, we can count how many steps the computation takes. Let StepCount(P,x) be the number of steps P needs to compute its result from input x.
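As a rough illustration, here is the ALL_A decider instrumented to report its own step count (a sketch only; counting one loop iteration as one “step” is a simplifying assumption about the cost model):

```python
def decide_all_a_counted(text):
    """Decide ALL_A and also report the number of loop iterations used."""
    steps = 0
    for ch in text:
        steps += 1
        if ch != 'A':
            return False, steps       # reject early, reporting the steps so far
    return len(text) > 0, steps       # accept only nonempty all-'A' strings

# StepCount(P, x) for this particular P and the input x = "AAAA":
result, steps = decide_all_a_counted("AAAA")
print(result, steps)   # True 4
```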

Now let’s consider the effect of the size of the input on the number of steps. 

In general, we expect larger inputs to trigger longer sequences of computation steps. In the first place, it takes more steps just to scan more text. Moreover, the analysis performed on that text may itself require more steps as the input grows.

Checking whether all characters are ‘A’ takes order N work, where N is the size of the input string.

For something that requires more than order N work, consider a program which takes as input the text representation of a number, parses it, and counts the factors of that number. The factor analysis calls for a significant amount of computation beyond the N steps needed to scan the input, and this work grows as the input values grow.
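Here is a hedged sketch of such a program, using trial division purely for clarity (the choice of algorithm is an assumption; more efficient factor-counting methods exist):

```python
def count_factors(text):
    """Parse a decimal string and count the divisors of the number it denotes."""
    n = int(text)                 # roughly order-N work to scan and parse the digits
    factors = 0
    for d in range(1, n + 1):     # this analysis grows with the value of n,
        if n % d == 0:            # far beyond the N steps needed to read the input
            factors += 1
    return factors

print(count_factors("12"))   # 6, since 1, 2, 3, 4, 6 and 12 all divide 12
```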

These facts are indicators of what is known as the time complexity of the computation.

Sidenote: the space complexity of a computation pertains to the amount of memory it uses, as a function of input size. In our context, we are considering just the time complexity, i.e., the running time as a function of input size.

Worst-case complexity

It is fair to ask what is meant by the number of computation steps required to compute the output, for input of size N. After all, there are multiple input strings of size N, each of which may trigger a different number of computation steps. For instance, take our loop-based decision procedure for ALL_A. Input “AAAAA” causes 5 computation steps (as the loop runs through to completion). But “AABAA”, also of length 5, causes only 3 computation steps (as the loop terminates once it sees the ‘B’).
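Using the step-counting decider sketched earlier (and its simplified cost model), the difference is easy to see:

```python
print(decide_all_a_counted("AAAAA"))   # (True, 5)  -- the loop runs to completion
print(decide_all_a_counted("AABAA"))   # (False, 3) -- the loop stops at the 'B'
```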

What we care about here is the maximum number of steps that will be needed for a given input size. In other words, we focus our analysis on the worst-case behavior for each input size N. The motivation here is that any bound on running time that we put on the worst-case inputs of size N gives us a bound for all inputs of size N.

Let MaxSteps(P,N) be the maximum number of steps that program P may perform for an input of size N.

Let P'(N) = MaxSteps(P,N). 

So P’ is the function that gives us the maximum number of steps that P will perform for any input size N.
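For small N, MaxSteps can even be computed by brute force (a sketch; it assumes the input alphabet is just {'A', 'B'}, and enumerating every string of length N is only feasible for tiny N):

```python
from itertools import product

def max_steps(decider, n, alphabet="AB"):
    """MaxSteps(P, N): the largest step count over all inputs of length n."""
    worst = 0
    for chars in product(alphabet, repeat=n):
        _, steps = decider("".join(chars))
        worst = max(worst, steps)
    return worst

# P'(n) = MaxSteps(P, n) for the step-counting ALL_A decider:
print([max_steps(decide_all_a_counted, n) for n in range(1, 6)])   # [1, 2, 3, 4, 5]
```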

P’ is our first step towards quantifying the time complexity of the program P. 

But for typical purposes, P’ presents an overly detailed description of complexity.

For instance, take a decision procedure P for ALL_A. We would like to be able to summarize the time complexity of P by saying that it is linear in the size of the input string. That information is contained in the structure of the function P'(n). But P’ offers a much more granular description, with one value for every natural number n.

In the next posts we will look at how to classify P'(n) into general complexity classes, such as linear, quadratic and exponential.

Copyright © 2021, David A. Tanzer. All Rights Reserved.
