Thursday 1 May 2008

OOP and COP

Today I read the article "Erlang for the OO-minded" on the MUE - Embrace Change blog. Its author sees Erlang processes as a kind of object and writes about the similarities.

To me OOP is a nice way of organizing programs. An OOP object is basically a compound of data and code components (functions / methods). On top of that we get
  • encapsulation (making parts of the objects more local by restricting access),
  • inheritance (a way to organize similar objects) and
  • polymorphism (a way to organize the method calls among similar objects).
The Wikipedia article lists some more properties.
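
As a small illustration of these three properties, here is a minimal Scala sketch. All names in it (OopSketch, Counter, Shape, Rectangle, Square, totalArea) are made up for this example and do not come from any of the articles mentioned here.

```scala
object OopSketch {
  // Encapsulation: count is private and can only be changed through increment().
  class Counter {
    private var count = 0
    def increment(): Unit = count += 1
    def current: Int = count
  }

  // Inheritance: similar objects are organized under a common base class.
  abstract class Shape {
    def area: Double
  }

  class Rectangle(width: Double, height: Double) extends Shape {
    def area: Double = width * height
  }

  // A Square reuses everything from Rectangle.
  class Square(side: Double) extends Rectangle(side, side)

  // Polymorphism: the caller invokes area without knowing the concrete class.
  def totalArea(shapes: List[Shape]): Double = shapes.map(_.area).sum
}
```

Calling OopSketch.totalArea(List(new OopSketch.Rectangle(2, 3), new OopSketch.Square(2))) adds up the areas of both shapes, although totalArea only knows about Shape.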

Before this, people came up with other organization schemes, among them ideas like
  • using symbols (use descriptive names for code and data entities),
  • macros (aggregate several instructions into one larger one),
  • modules (cut the large code files into smaller ones, make the interfaces explicit),
  • algorithms (useful problem solving strategies),
  • data structures (useful organization schemes for data).
All this helped to organize large programs into manageable units.

IMHO OOP benefited from the fact that an OOP object has similarities to physical objects:
  • The data components of an object are similar to the states of a physical object.
  • The code components of an object are similar to certain actions / transformations / processes which apply to physical objects.
We are quite used to coping with objects from the physical world, and that helped us to deal with OOP objects. So it is not surprising that, for example, computer graphics and games, which to a large extent are simulations of the physical world, benefited from OOP.

Software patterns are another organizational effort comparable to the invention of data structures for data and algorithms for code. People recognized useful arrangements and usage schemes of objects.

Now let's turn to Erlang. The most prominent entities of Erlang are processes. They model tiny units of execution. Each process can be seen as a virtual CPU. The processes communicate by message passing, either asynchronously (don't wait for an answer) or synchronously (wait for an answer).
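
To make the difference between the two styles concrete, here is a minimal sketch in Scala. It does not use Erlang or any actor library: a plain thread with a blocking queue stands in for a process and its mailbox. All names (MessagePassingSketch, CounterProcess, Add, Get, Stop, send, call) are assumptions made up for this illustration.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.annotation.tailrec

object MessagePassingSketch {
  sealed trait Message
  case class Add(n: Int) extends Message                              // no reply expected
  case class Get(replyTo: LinkedBlockingQueue[Int]) extends Message   // reply expected
  case object Stop extends Message

  // A stand-in for a process: one thread that owns one mailbox.
  class CounterProcess {
    private val mailbox = new LinkedBlockingQueue[Message]()

    private val worker = new Thread(() => loop(0))
    worker.setDaemon(true)
    worker.start()

    // The running total lives only in the argument of the loop.
    @tailrec
    private def loop(total: Int): Unit = mailbox.take() match {
      case Add(n) =>
        loop(total + n)
      case Get(replyTo) =>
        replyTo.put(total)
        loop(total)
      case Stop =>
        ()
    }

    // Asynchronous: put the message into the mailbox and return at once.
    def send(msg: Message): Unit = mailbox.put(msg)

    // Synchronous: send a request and block until the answer arrives.
    def call(): Int = {
      val reply = new LinkedBlockingQueue[Int]()
      send(Get(reply))
      reply.take()
    }
  }
}
```

After counter.send(Add(2)) and counter.send(Add(3)) the sender continues immediately, while counter.call() waits for the answer, which is 5 here because the mailbox is processed in order.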

Processes allow us to split the large execution power of one or several real CPUs into many virtual threads of execution. As an additional treat we get distribution: the ability to do this across several separate multi-CPU machines connected by a network.

Execution / concurrency is something that OOP alone does not cover in a nice way. Thus in OOP programming languages it usually shows up somewhere else, for example as ugly thread support or complicated synchronization mechanisms.

What Joe Armstrong wrote in Concurrency Oriented Programming in Erlang is true:
In languages like (noname) they forgot about concurrency. It either wasn’t designed in from the beginning or else it was added on as an afterthought.
This doesn’t matter for sequential programs. If your problem is essentially concurrent then this is a fatal mistake.
Joe likes to call the paradigm behind Erlang Concurrency Oriented Programming (COP):
A language is a COPL if:
  • Processes are truly independent
  • No penalty for massive parallelism
  • No unavoidable penalty for distribution
  • Concurrent behavior of program same on all OSs
  • Can deal with failure
He describes the benefits as:
Why is COP Nice?
  • The world is parallel
  • The world is distributed
  • Things fail
  • Our brains intuitively understand parallelism (think driving a car)
  • To program a real-world application we observe the concurrency patterns = no guesswork (only observation, and getting the granularity right)
  • Our programs are automatically scalable, have automatic fault tolerance (if the program works at all on a uni-processor it will work in a distributed network)
  • Make more powerful by adding more processors
Note that Joe mentions the physical world a few times too.

Thus COP is another way to organize programs, or rather programmed systems, with concurrency in mind. Erlang at least offers some more organization in that direction: its OTP libraries and tools, with their patterns and frameworks for concurrency.

Now back to the point of the original MUE article - is an Erlang process related to an OOP object?

My answer is mostly no.

Both are basic building blocks with boundaries that provide encapsulation, but there it ends. An Erlang process does not carry data fields the way an object does (its state lives in the arguments of the function it runs), and it consists of a single function, not many methods.

The strongest counterargument probably comes from a look at the Scala language, which features actors. Here is code from Scala Actors - A Short Tutorial:
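import scala.actors.Actor   // the legacy scala.actors library used by the tutorial

// Ping, Pong and Stop are message objects defined elsewhere in the tutorial.
class Ping(count: Int, pong: Actor) extends Actor {
  def act() {
    var pingsLeft = count - 1
    pong ! Ping                        // send the first ping
    while (true) {
      receive {
        case Pong =>
          if (pingsLeft % 1000 == 0)
            Console.println("Ping: pong")
          if (pingsLeft > 0) {
            pong ! Ping                // answer each pong with another ping
            pingsLeft -= 1
          } else {
            Console.println("Ping: stop")
            pong ! Stop                // tell the other actor to finish
            exit()
          }
      }
    }
  }
}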
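This looks somewhat like Erlang code: we recognize the bang (!) operator for sending messages and the receive statement. But the loop is a while loop over a mutable variable rather than tail recursion, and pattern matching plays a much smaller role than in Erlang (it shows up only in the case clause).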
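
For comparison, here is a minimal sketch of how the same ping loop could look in a more Erlang-like style, with tail recursion and pattern matching doing the work. It is not the tutorial's code and does not use the scala.actors library: plain threads and blocking queues stand in for actors and mailboxes, and all names (TailRecursivePing, pingLoop, pongLoop, run) are made up for this illustration.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.annotation.tailrec

object TailRecursivePing {
  sealed trait Msg
  case object Ping extends Msg
  case object Pong extends Msg
  case object Stop extends Msg

  type Mailbox = LinkedBlockingQueue[Msg]

  // pingsLeft is an argument of the loop, not a mutable variable.
  // Returns the number of pongs received.
  @tailrec
  def pingLoop(inbox: Mailbox, pong: Mailbox, pingsLeft: Int, received: Int): Int =
    inbox.take() match {
      case Pong if pingsLeft > 0 =>
        pong.put(Ping)
        pingLoop(inbox, pong, pingsLeft - 1, received + 1)
      case Pong =>
        pong.put(Stop)
        received + 1
      case _ =>
        pingLoop(inbox, pong, pingsLeft, received)   // ignore anything else
    }

  // Answers every Ping with a Pong until Stop arrives.
  @tailrec
  def pongLoop(inbox: Mailbox, ping: Mailbox): Unit =
    inbox.take() match {
      case Ping =>
        ping.put(Pong)
        pongLoop(inbox, ping)
      case Stop =>
        ()
      case _ =>
        pongLoop(inbox, ping)
    }

  def run(count: Int): Int = {
    val pingBox = new LinkedBlockingQueue[Msg]()
    val pongBox = new LinkedBlockingQueue[Msg]()
    val pongThread = new Thread(() => pongLoop(pongBox, pingBox))
    pongThread.start()
    pongBox.put(Ping)                                // first ping, as in act() above
    val pongs = pingLoop(pingBox, pongBox, count - 1, 0)
    pongThread.join()
    pongs
  }
}
```

TailRecursivePing.run(3) exchanges three pings and three pongs and then stops both loops. The guard on the first case replaces the if / else of the original, and the counter is passed along as an argument instead of being mutated.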

Thus the missing piece that the MUE author was really looking for is the actor model. If you can have that in a functional language like Erlang and in an OOP language like Scala, it must be something orthogonal to both paradigms.

Actors seem to be an interesting concept. So far I have only read the "Actor Model" and "Actor model and process calculi" articles on Wikipedia, plus the interesting article "Why has the actor model not succeeded?". The actor model is attributed to a paper from 1973 and is thus 35 years old.

Erlang is from 1985 and thus 23 years old. So far I can't remember any reference by Joe Armstrong to the actor model. I somewhat doubt that the Erlang creators studied the mathematical actor papers while conceiving Erlang and then never gave them credit, so it might be a parallel development.

With Erlang and possibly Scala, concurrent programming has become a bit more mainstream. I also hope for a broader understanding of the theoretical foundations, which would allow better use of those tools. Right now that kind of knowledge is limited to a few specialists.