Pure functions are functions whose result totally depends on its arguments and it doesnt make any side effecting changes. One of the important property of such functions is given same arguments would give back same results. Lot of languages don’t have restrictions on functions to have that property. But its nice to have that in place since its much easier to reason about functions like that. We have enough tools in mathematics to actually talk about those functions, since all mathematical functions are pure.

IO is one of those computational problem which is hard to model it that way. Most programming languages have a function to get a line of input from the user. How can we we implement such a function in a language which supports only pure functions? Is it even possible to model IO as pure functions?

I would try to model something like that step by step.

Get Line

Let’s consider the case of a programming language which allows only one line of input to be read, that too at the start of the program, and prints out a result at the end. We can then easily model that by making the main function of the program take a string argument and produce a string. Runtime of such a programming language would read a string, pass it on to the function and prints out whatever the function returns.

trait PureIOProgram {
  def main(arg:String):String
}

class Runtime {
  def run(program:PureIOProgram): Unit = {
    println(program.main(new Scanner(System.in).nextLine()))
  }
}

object Runtime {
  def main(args:Array[String]): Unit = {
    new Runtime().run(new SquareNumber())
  }
}

And consider this is the starting point of the program, and everything above is done under the hood for you.

class SquareNumber extends PureIOProgram {
  override def main(arg: String): String = (arg.toInt * arg.toInt).toString
}

Interaction

IO usually is more interactive than that. Like we take input do some processing, get more input or output and in arbitrary combinations of it. How do we model something like that?

We are going to make use of something we did in last example. Instead of returning the string to output, if we can return some datastructure which lets us know whether to take more input or output. This datastructure would let the runtime exactly know what to do next, in what sequence. Like.,

sealed trait IO {}

case class InputLine(fn:String => IO) extends IO
case class OutputLine(value:String, fn:() => IO) extends IO
case object Exit extends IO

InputLine takes a function which takes the read input line and produces more IO actions. OutputLine outputs whatever value is and calls the function for more IO. And Exit simply exits.

Now our interactive program would look like

class SquareNumber extends PureIOProgram {
  override def main = {
    OutputLine("Enter a number: ", () => {
      InputLine(line => {
        OutputLine("The square of the number is " + (line.toInt *line.toInt), () => Exit)
      })  
    })
  }
}

Note that none of the functions, even the anonymous functions inside the IO actions are impure. IO is done in a symbolic way rather than in

And our runtime

class Runtime {
  val scanner = new Scanner(System.in)
  def run(program:PureIOProgram): Unit = {
    exec(program.main)
  }
  
  def exec:PartialFunction[IO, Unit] = {
    case InputLine(fn) => exec(fn(scanner.nextLine()))
    case OutputLine(line, fn) => println(line); exec(fn())
    case Exit =>   
  }
}

Higher order functions

If we look at InputLine and OutputLine, the sequencing logic is almost the same. Once we add more io actions like spinning up threads (Yes we can treat them as IO), leaving that responsibility to individual actions doesnt make lot of sense. But thats the only way of composition we provide. In-order to have better ways compose and have proper responsibility segregation we introduce 2 more IO actions.

case class Sequence[T,K](action:IO[T], fn: T => IO[K]) extends IO[K]
case class Return[T](value:T) extends IO[T]

The use case for Return becomes explicit when we try to compose couple of more IO actions. These 2 operations (Return and Sequence) together form, what is called a Monad. Which is nothing but any structure which has a flatMap and map operation. List, Option, Future etc are all Monad.

After some more restructuring of IO and addition of flatMap and map results in our IO being

case class IOResult[T](value:T)
object IO {
  trait IO [T] {
    def flatMap[L](fn: T => IO[L]):IO[L] = {
      Sequence(this, fn)
    }
    def map[L](fn: T => L):IO[L] = Sequence(this, {v:T => Return(fn(v))})
    def exec:IOResult[T]
  }
  private case object InputLine extends IO[String]  {
    val scanner = new Scanner(System.in)
    override def exec: IOResult[String] = IOResult(scanner.nextLine())
  }
 private case class Sequence[T,K](action:IO[T], fn: T => IO[K]) extends IO[K] {
    override def exec: IOResult[K] = fn(action.exec.value).exec
  }
  private case class OutputLine(line:String) extends IO[Unit] {
    override def exec: IOResult[Unit] = {
      println(line)
      IOResult(Unit)
    }
  }
  private case class Return[T](value:T) extends IO[T] {
    override def exec: IOResult[T] = IOResult(value)
  }

  def printLine(line:String):IO[Unit] = OutputLine(line)
  def getLine:IO[String] = InputLine
  def returnn[T](value:T):IO[T] = Return(value)
}

Note that, we would no longer expose InputLine or OutputLine or Sequence to outside world. We only expose IO interface outside. Hence its safe to have polymorphic behaviours here.And every IO action should do the same too. Since exec is impure, as long as exec isnt directly used by the programs in our language, they remain completely pure.

class SquareNumber extends PureIOProgram {
  override def main = {
    for(
      _ <- printLine("Enter a number:");
      line <- getLine;
      _ <- printLine("The square of the number is " + (line.toInt * line.toInt))
    ) yield ()
  }
}

Scala’s for syntax is transformed to flatMaps except for the last one which get converted to map incase there is an yield. Now our SquareNumber program is lot more easier to read, except ofcourse the _ to eat away the printLine result. Haskell does it much better, with its do syntax we can just ignore results.

Also note that the program is still pure, though it might look like it ain’t. Any reasoning we can apply on pure functions can be applied to any part of the program.

Complete source code is here. Individual sections are available as seperate branches.