One of the most ubiquitous concepts found in programming languages is that of a function.
There are many things that functions can do, and this makes it hard to come up with a concise definition of a function, even if we narrow our focus to a single language.
Functions In A Programming Context
Consider some things we could say about what functions do in different programming contexts:
- A function can accept any number and variety of inputs, including no inputs
- A function can cause text to be rendered to an output environment such as the browser, a terminal console, etc.
- A function can return a value, although it doesn’t have to
- A function’s return value may or may not depend on the inputs or the state of the underlying program
- A function can, in some cases, change the value of external variables or otherwise alter the state of the underlying program
This is quite a range of possibilities, and there are probably some things you can think of that I am missing from my list.
Definition, Please?
It’s probably impossible for all programmers to come to agreement on a definition of the word function, but I imagine that such a thing might go something like this:
In general programming terms, a function is a routine that accepts zero or more inputs, performs a task or algorithm, and possibly returns a value.
While I am not offering this as a definition that others should adopt, it’s as close as I can personally get to stating a definition.
Functions In A Mathematical Context
Mathematics is an area where, if there is uncertainty about the definitions of any key terms, the person reading or writing the math will not be able to proceed very far. Math is basically the science of asking “why” until you are forced to state a definition, an assumption, or cite an already established proof.
We can use functions in all different kinds of ways in programming without necessarily caring to define the word, in large part because the computer lets us interact with functions and we can learn by mimicking, experimenting, and observing.
While this may be true to some extent in math as well, we really aren’t able to have words with wide contexts and interpretations in math like we sometimes can in programming.
An Historical Tangent
Consider for example the story of Euclid, an ancient Greek mathematician who lived around 300 B.C. He collected known theorems from his contemporaries in geometry and organized a 13 volume book of geometric proofs called The Elements that starts off with 23 definitions and 8 assumptions.
Euclid’s last definition of Book I is for parallel lines, and it is strongly intertwined with one of his assumptions.
When mathematicians in the early 1800’s began to formalize a system in which the assumptions around parallel lines differed from Euclid’s, it gave birth to a new branch of math called Non-Euclidean Geometry.
The point here is that a new branch of math was created 2100 years after Euclid lived, because some folks decided to change what they meant by the word parallel. Since the smallest twist in meaning or understanding can have such a high impact, I think math is a great device for informing and challenging my understanding of programming.
Working Toward A Formal Definition
To define a function in mathematical terms, we need to have the concept of an input set and an output set. We can grab any item from the input set, feed it into the function, and expect that the function will return an item from the output set in a well-defined, predictable, and consistent way.
The key idea and defining characteristic of a function in mathematics is that for any given input from the input set, there must be one and only one corresponding output from the output set. Under no circumstances can the same input ever yield differing outputs.
Taking A Non-Example
In math terms, the square root relation is a good counterexample which violates the definition of a function. Let’s consider the square root relation with the assumption that the input set is all non-negative real numbers and the output set is equal to all real numbers.
We can imagine feeding a non-negative real number into a “square rooting” machine and having the machine give us back a real number which is the square root of the given number.
The catch is that the output is not necessarily predictable or consistent, for if we put in 4 we might get back either 2 or -2. If we put in 9, we might get back 3 or -3. In short, there are two possible square roots for every non-zero number so the square root relation is not a function.
As a side note, we often turn the square root relation into a function by restricting the output set to be the set of all non-negative numbers. This ensures that the function always returns the non-negative version of the square root and thus qualifies as a function.
Taking An Actual Example
A quick and easy example of a mathematical function would be adding one to a number. Give the “plus one machine” any real number and it gives you back another real number equal to your number plus one.
In this case, there’s no chance for ambiguity and no way to get two possible answers. No matter how you feel, what’s going on in the world, what program you are running, or whether you have administrative privileges, you are always going to get back the same exact response for any given input.
In essence, this points to exactly what defines a function in the mathematical sense – the existence and uniqueness of an output for any given input from the input set.
A Function In Math Is Just A Potentially Infinite Key/Value Store
When we think of an array as a key/value hash (as opposed to a list), there are strong parallels to the concept of a function in mathematics. In fact, this is possibly the best example in programming to illustrate the mathematical version of a function, because every key that exists in the array must be assigned a value yet cannot be assigned more than one value.
Let’s suppose we have the following array.
We can think of this associative array as a function which has { 'foo', 'baz' }
as the input set and the set of all strings as the output set.
We could also get more strict and consider { 'bar', 'bat' }
to be the output set, but we usually don’t need to strictly specify the output set the same way we do for the input set. After all, the values of an array can often vary widely but our keys are more meaningful and important to the function mechanism underlying it.
Now, what happens if we want to change the value associated with a key?
Note that, like a mathematical function, we are not allowed to have $my_array['baz']
equal to two different values. The key ‘baz’ must have one and only one value, so the new value completely replaces the old one.
A similar thing happens if we try to explicitly list two different values for the same key within the array.
Again, we have 'bae'
replacing 'bat'
as the value associated with the key 'bae'
, because no key is allowed to yield two different values.
“Well wait a second,” you might say, “I can assign another array as the value associated with any given key.” And you’d be right about that!
This is totally valid, and in a sense there are two values ('bat'
and 'bae'
) associated with the key 'baz'
. Strictly speaking though, there is still only one value; and that is the array itself which happens to contain two values within it.
Note in the last example that changing the value of $my_array['baz']
to an array has altered the output set of the “function” represented by $my_array
. We are no longer outputting elements from “the set of all strings”, so we could say that the output set equals “the set of all strings unioned with the set of all arrays”.
To sum up, associative arrays (or hashes or key/value stores) are a concept in programming that almost perfectly embody the mathematical definition of a function. In both cases, we have a set of keys, and each key leads to one and only one value.
One big difference is that in math our set of keys can easily be infinite (such as the set of all real numbers or the set of all integers), and so the metaphor breaks down when we realize that an array cannot contain infinitely many items. If we wrote a program that attempted to push infinitely many items to an array, our program would run out of memory and could not be completed in real time.
Drawing Parallels
We have explored the fact that in programming, a function need not meet the strict mathematical definition of a function. We also saw that associative arrays are much more like mathematical functions because they embody the idea of an input yielding a unique and well-defined output.
To contrast, I consider that at the beginning of my programming journey, despite all of my math training up until that point, I would not have blinked or thought twice about writing this function and being happy with it as long as it “worked”:
I’m not saying it’s bad to have a function with no inputs, just that it is much different than how functions work in math and therefore worth thinking about and reflecting on for me.
Math has helped me to understand that when I write a function with no inputs, I very much need to think about the outside factors (or “indirect inputs”) I might be using within my function. If I don’t consider this, I could be bringing in unpredictable dependencies to my function that could lead to bugs or vulnerabilities.
With the aim of being more predictable, consistent, and up front about what the true influencing factors are for any given function, I might rewrite the above more like this:
This lets us do the logic elsewhere regarding whether the user is logged in or not. In short, it allows our function to not really care about external things and just do its one thing based on the given input.
Conclusion
I think it’s valuable to try and find patterns and habits that work their way into my code, and to always be looking for distinctions in mindset that can help me write better code and uncover my assumptions.
Personally, math is the best-fit tool in my toolbox for approaching this kind of thought. I hope that by continuing to write and share, I can encourage others to explore their own habits, patterns, and assumptions.
Leave a Reply