Saturday, August 05, 2006

Some interesting aspects of programming : Continuations

If you think about what a program does, it starts with inputs and events to deal with.
Based on a certain combination of those, it will
    -> perform a computation
    -> feed the results to other functions, which will themselves do the same
So conceptually we are dealing with a control-flow graph, and programming is about finding a way to transcribe that control flow correctly.

Good programming is about doing it in a way that maximizes the signal-to-noise ratio: maximizing expressiveness and minimizing the plumbing.

The usual programming languages are "stack based", i.e. they implicitly give computed results back to the caller. This implicit and uncontrollable link leads to problems when you want to send a result to several places, as your graph requires, or to deliver the results in response to a particular event, as happens with GUIs and with asynchronous programming in general.

That's why continuations, by separating computations from the control flow, are sometimes useful.
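
To make this concrete, here is a minimal sketch in TypeScript, not taken from any particular library. The names addDirect, addCps and fanOut are made up for illustration. In the direct version the result can only go back to the caller; in the continuation-passing version the caller hands over "what to do next", so the same result can be routed to several places in the graph.

```typescript
// A continuation is simply "what to do next with the result", passed in explicitly.
type Continuation<T> = (result: T) => void;

// Direct (stack-based) style: the result implicitly goes back to the single caller.
function addDirect(a: number, b: number): number {
  return a + b;
}

// Continuation-passing style: the caller decides where the result goes.
function addCps(a: number, b: number, next: Continuation<number>): void {
  next(a + b);
}

// Hypothetical helper: fan one result out to several consumers,
// i.e. one node of the control-flow graph with several outgoing edges.
function fanOut<T>(...targets: Continuation<T>[]): Continuation<T> {
  return (result) => targets.forEach((target) => target(result));
}

const log: string[] = [];

addCps(2, 3, fanOut<number>(
  // First consumer: just record the sum.
  (sum) => log.push(`display: ${sum}`),
  // Second consumer: feed the sum into another computation.
  (sum) => addCps(sum, 10, (total) => log.push(`total: ${total}`)),
));
```

After this runs, log holds "display: 5" followed by "total: 15": the single sum was delivered to two different consumers, which the plain return in addDirect cannot express without extra plumbing.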

Sunday, June 25, 2006

Queryable data store

It seems that my call for a sweeter handling of runtime queries against a data store was already on the agenda of Microsoft's Linq development team.

My proposed solution, however, has the extra feature of exposing the data model outside of its context, so that it can be used for type assistance, for instance. That point is critical for me, as it is the only way to leverage all the work done inside your software beyond your own business scope. In a network economy, this is crucial for customers: software always needs to fit into an ecosystem somehow, and that ecosystem plumbing is where a lot of the cost and trouble lies.

If we go back to Linq, the added value this time is clearly the IQueryable interface.
When I wanted to use Linq in a project, I ended up filling in the gaps in exactly the same place, around this notion of a "data repository" which accepts queries and translates them. This functionality is now a standard feature of Linq, with two main additions:

You can generate an expression tree (i.e. build an abstract representation of a query from unstructured text):

    ParameterExpression p = Expression.Parameter(typeof(Customer), "c");

    LambdaExpression predicate = QueryExpression.Lambda("c.City = 'London'", p);

And turn those trees into an actual function:

    Func<Customer,bool> d = predicate.Compile();

Now you have a whole continuum for your Linq queries, ranging from static to dynamic, and you can hand some control back to your users.
Unfortunately, the only entry point is an... untyped string, which is precisely what Linq stands against.
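
As a rough illustration of the idea, and not of the Linq API itself, here is a language-neutral sketch written in TypeScript. The Expr type and the evaluate and compile functions are hypothetical names; the point is only that a query can be built as typed data instead of a string, and then turned into an ordinary function. Where Linq's Compile() yields a delegate, this sketch simply interprets the tree.

```typescript
// Hypothetical record type, mirroring the Customer of the C# snippet.
interface Customer {
  name: string;
  city: string;
}

// A tiny expression tree: data describing a query, not yet executable.
type Expr =
  | { kind: "field"; name: keyof Customer }
  | { kind: "constant"; value: string }
  | { kind: "equal"; left: Expr; right: Expr }
  | { kind: "and"; left: Expr; right: Expr };

// Evaluates one node of the tree against one customer.
function evaluate(e: Expr, c: Customer): string | boolean {
  switch (e.kind) {
    case "field":
      return c[e.name];
    case "constant":
      return e.value;
    case "equal":
      return evaluate(e.left, c) === evaluate(e.right, c);
    case "and":
      return evaluate(e.left, c) === true && evaluate(e.right, c) === true;
  }
}

// Turns the tree into an ordinary predicate function (by interpretation).
function compile(e: Expr): (c: Customer) => boolean {
  return (c) => evaluate(e, c) === true;
}

// The typed equivalent of the string "c.City = 'London'".
// A misspelled field name here is rejected by the type checker.
const inLondon: Expr = {
  kind: "equal",
  left: { kind: "field", name: "city" },
  right: { kind: "constant", value: "London" },
};

const predicate = compile(inLondon);
const customers: Customer[] = [
  { name: "Ann", city: "London" },
  { name: "Bob", city: "Paris" },
];
const londoners = customers.filter(predicate).map((c) => c.name);
```

Here londoners ends up containing only "Ann". Because the tree is plain data, it could just as well be built by a user interface at runtime, which is the dynamic end of the continuum, without going through an untyped string.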

Maybe one day Microsoft will realize they need to do for end users what they aim to do with Linq for programmers, that is, bring back strong typing wherever possible. They might then end up with my solution, and my genius will finally be recognized.

Tuesday, June 06, 2006

Building bridges : follow up

With the help of the illustration of my readers' flow, I stressed in my last post the pure randomness, or close to it, of some of my readership, who are most welcome nonetheless. If you bear with me, as a continuation of that post, I will try to show how web 2.0 empowers a very hot subject, mostly underground after long years in the desert of conceptual useless-land: the semantic web, whose European conference is to be held next week in Montenegro. (Now look at the picture and tell me you are not interested in the semantic web..) I was previously stressing the importance of building vocabularies statistically, and to show this I would like to raise the question of why we know things.
  • Well, we either identify some relevant concepts and, by intuition, formulate a hypothesis on their relationship. Then, if it turns out to be true, we attach our name to the relationship and become famous, if not rich.
ex : F = m a (Sir Newton)
  • Or, if you have lots of data dealing with those notions of mass and force, you can find the same law statistically, even though their essence is unknown to you.
ex : That hammer falling on my foot hurts. Let's try from higher
In the matter of human knowledge, a broader subject than three variables, no one is going to take on the task of describing all the notions in the world. You might consider having experts cover their own domains, but you would then need too many people to connect those domains. So the only viable option is to rely on a heavily distributed method. If you add the further constraint that content producers should not have to maintain the map of notions themselves, as they don't necessarily have the time or interest to do so, you have little left to rely on but statistics... or at least they should be a great help. This subject of semantic extraction is one of the workshops of the European conference, and its summary may make the point (and the challenges..) better:
Mastering the Gap: From Information Extraction to Semantic Representation
Automating the process of semantic annotation of content objects is a crucial step for bootstrapping the Semantic Web. This process requires a complex flow of activities which combines competences from different areas. The Workshop will focus precisely on the interface between the information extracted from content objects (e.g., using methods from NLP, image processing, text mining, etc.) and the semantic layer in which this information is explicitly represented in the form of ontologies and their instances. The workshop will provide an opportunity for: discussing adequate methods, processes (pipelines) and representation formats for the annotation process; reaching a shared understanding with respect to the terminology in the area; discussing the lessons learned from projects in the area, and putting together a list of the most critical issues to be tackled by the research community to make further progress in the area.
The global goal of this semantic effort is certainly not to discourage my beloved readers from arriving from the finest and most highly spirited sites in a random manner. Randomness is something you can always have, and it is essential for providing you with radically new knowledge. Just look at how many people know of the semantic web; now you might be one of those cutting-edge people. The goal of the semantic web is to pursue that discovery of themes on the basis of items which are relevant to certain subjects. A very simple illustration: I discovered yesterday that one notion, called categories, helps a lot in various, very different subjects I am interested in, like algebra, quantum physics, and computer language theory. How is it that, being interested in those subjects, I had never heard of categories? Well, it might be that for any one of those individual domains they are a bit too abstract, so they go under the radar for many people. But when you consider the three domains together, you are much better off learning categories first and saving lots of time later. Had I had access to a semantic map of those domains, I would have noticed this strange object belonging to all of them, and read up on it. That's what the semantic web is for, and not for discouraging dear readers from coming from other quality blogs ;)

Sunday, June 04, 2006

Web 2.0 : Building bridges

So, having proved in my previous post, once and for all, that the web 2.0 concept is manifold, I can only go on and add one more aspect of what web 2.0 is. Many of my readers right now, maybe you, are coming from my girlfriend's blog. Her readers are coming from a best-of-breed blog called inparisnow, which was recently referenced in a Wall Street Journal article. How disappointing it must be for a reader of charming (or not) stories about Paris to come here and see stuff about "continuations" and "first order logic"... Even for Wall Street Journal readers, whose interests may be close to my job, since I happen to work in finance, the journey must be pretty disappointing.

Well, this recognized fact is precisely what I think web 2.0 is targeting. Tagging, social networking, site mashups: all this is about building semantic groups which relate a specific audience to specific content. The added twist to this bridge construction is the distributed nature of the semantic group. What looks like "computer theory" to one reader will be seen as a "typing system" by another, depending on their prior knowledge: the vocabulary of thought differs between users, and naturally creates groups of users.

Within a group sharing a common vocabulary, opinions might diverge on the relative importance of theories, but if you accept that intragroup variations are smaller than intergroup variations, then the problem of bringing information to people is more complex than a simple democratic vote, and the solution is not straightforward. Having just a popularity system is like voting on what the result of 2+2 is: it won't help much.. So after the current techniques, like tagging and voting, have been taken to their technical limits, my bet is that the next challenge will be how to easily build those 'vocabularies' in a distributed way, through complex statistical tools or methods which will be rethought so as to integrate very easily.
Which will eventually bring me to develop the "emergence" section sometime soon :)

Saturday, June 03, 2006

Closure and asynchronous functions

Closures are, in a programming language, the ability of a local function to access at runtime the objects that were in scope at the point where that local function was defined. For instance

    void somefunction ()
    {
        string variable1 = "Hello";
        // The anonymous method captures variable1.
        myButton.Click += delegate { myButton.Text = variable1; };
    }

will be legal, even though the button will be clicked long after somefunction() has exited, when variable1 would normally be out of scope.

This is very handy when dealing with asynchronous programming. Asynchronous programming involves breaking a single function down into a call to a service and the handling of the response from that service. Closures let you access a single set of variables, just as in a single function, across all the pieces of the original function.

    // Pseudocode: blocking version
    void synchronousfunction()
    {
        myquerybutton.status = disabled;
        results = makebigQuery();
        window.display(results);
        myquerybutton.status = enabled;
    }

    // Pseudocode: non-blocking version, the callback closes over its context
    void asynchronousfunction()
    {
        myquerybutton.status = disabled;
        query.oncomplete += new function(results) {
            window.display(results);
            myquerybutton.status = enabled;
        };
        query.launchquery();
    }

The code looks very much alike, thanks to the closure. This kind of closure can be found in C#, JavaScript 1.5 and many functional programming languages. Now if someone knows an easy way to synchronize my job schedule with my personal one, I'd be glad :)
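
For readers who want something they can actually run, here is a minimal sketch of the same asynchronous pattern, written in TypeScript rather than C#. FakeQuery and ui are hypothetical stand-ins for the real service and widgets, and complete() plays the role of the service answering at some later time.

```typescript
// Hypothetical stand-in for an asynchronous service: launch() returns immediately,
// and the registered handler only runs later, when complete() is called.
class FakeQuery {
  private handler: ((results: string[]) => void) | null = null;

  onComplete(handler: (results: string[]) => void): void {
    this.handler = handler;
  }

  launch(): void {
    // Nothing happens yet: the answer arrives later.
  }

  // Simulates the service answering at some later point in time.
  complete(results: string[]): void {
    if (this.handler !== null) {
      this.handler(results);
    }
  }
}

// Hypothetical UI state, standing in for the button and the window.
const ui = { buttonEnabled: true, displayed: [] as string[] };

function asynchronousFunction(query: FakeQuery): void {
  const label = "result: "; // local variable captured by the closure below
  ui.buttonEnabled = false;
  query.onComplete((results) => {
    // Runs after asynchronousFunction has returned, yet still sees `label`.
    ui.displayed = results.map((r) => label + r);
    ui.buttonEnabled = true;
  });
  query.launch();
}

const query = new FakeQuery();
asynchronousFunction(query);
const enabledWhilePending = ui.buttonEnabled; // the button stays disabled while waiting
query.complete(["a", "b"]);
```

Once complete() has been called, the button is enabled again and ui.displayed holds the two labelled results, even though label belonged to a function that returned long before the answer came in.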

Sunday, May 28, 2006

websites as graph

An interesting website : "Everyday, we look at dozens of websites. The structure of these websites is defined in HTML, the lingua franca for publishing information on the web. Your browser's job is to render the HTML according to the specs (most of the time, at least). You can look at the code behind any website by selecting the "View source" tab somewhere in your browser's menu. HTML consists of so-called tags, like the A tag for links, IMG tag for images and so on. Since tags are nested in other tags, they are arranged in a hierarchical manner, and that hierarchy can be represented as a graph. I've written a little app that visualizes such a graph, and here are some screenshots of websites that I often look at." website:

Saturday, May 27, 2006

The right tools to blog

If you want to set up a blog, I highly recommend getting the following software:
  • Firefox : this browser has the particularity of allowing "extensions" to be added. As it is the browser of choice for many bloggers, there is a very useful ecosystem of handy tools for it that makes it a real pleasure to use. Remember the first time you used Google, and that "wow" effect you felt? That's what you'll get with those nice extensions.
  • Performancing : this is an add-on to Firefox which helps you post to Blogger, Livejournal, and many other blog services. Very well done. It automates other aspects like posting tags on, Technorati, ping services, and other features.
  • For pictures, you can use Allyoucanupload if needed, a straightforward image upload service with no registration.
  •, used in combination with a display system on your blog, helps you publish specific content on your blog

Wednesday, May 10, 2006

The modeling wall

Despite the staggering sums mobilized (several hundred million euros for in-house projects at banks, several billion euros at specialized software vendors), we are unable to arrive at an integrated management system.

The functional richness to cover is enormous. Take, for example, managing the life of a product you buy through a retail banking network: "you receive 25% of the performance of the CAC 40, plus a guaranteed return of 5% per year".

First, it takes:
  • a sales process for the product (pre-sales management)
  • giving a price (handling historical data and models), possibly with some back and forth.
Once the characteristics are settled:
  • locking in the pricing parameters (hedging)
  • the issuer has to issue, internally, a set of securities each representing one component of the product. This split is necessary to obtain the best pricing conditions for each product, or to satisfy legal obligations.
  • The product generates exposures for the issuer to FX, interest rates, volatility, and other parameters.
  • To hedge these parameters, orders must be placed at precise times, corresponding to the product sold.
  • Changes in the financial parameters generate gains or losses for the issuer, who must report them to management at regular intervals, and then, at the accounting level, to the statutory auditor.
  • The internal risk-control units also continuously monitor the evolution of the commitments taken and, above all, the potential ones...

Etc., etc.

Why is this example interesting? Because behind these problems there are significant financial stakes. Moreover, these stakes are concentrated among a handful of players who have substantial means to throw at them. The conditions are therefore ideal.

And yet it fails.

The modeling wall has been hit.

A fortiori, other domains facing an equally rich functional complexity, but whose interest in putting a solution in place is dispersed, or which simply do not have those means, have no chance of seeing a software solution handle their functional domain.

Let us come back to the organizations that have the most going for them in the current setting. On the one hand, the modeling wall has been hit. On the other hand, that same modeling is currently required to launch projects, because there are agency costs: one department subcontracts the building of a piece of software to another, so the resulting invoice has to be validated, and to do that, one must have planned beforehand, and therefore modeled...

We are facing an organizational deadlock.

So the organization has to change. Authorizations to commit external spending on behalf of an interest group are not going to disappear. We will therefore have to find more flexible ways to develop solutions, deploy them, scale them, etc.

There is no other option if we want to go beyond the current software cycle, which remains extremely weak outside of large systems like Windows, among the "real" users of computing.

That is why, among all the technologies being put in place, I see those that foster and structure the emergence of modeling as the ones that matter.

Implementation is much easier. Individually, the various elements involved in the complex model presented above for managing a financial product are almost trivial. Collectively, they become very complicated. Architecting them so as to integrate changes without excessive bureaucracy is impossible.

PS: As for tags, they have been used for a very long time by Reuters, which classifies its news items this way, by language, country, theme, etc. An article about a bond issue in Poland is of interest to the credit market as much as to the interest-rate market, to advisory bankers, etc. Conversely, a news item about the Polish president visiting Germany limits the usefulness of keyword searches..

The problem is that while Reuters uses these tags very well, the categories, the meta-model, are defined... by Reuters! That is where the innovation of web-style tags lies: the ability to gather in common categories, but also the ability to scatter in order to discuss with other experts things that do not exist in the minds of other contemporaries.

Monday, May 08, 2006


Tags are only a first step. What is the point of tags, really? In my view, they are interesting because they are the tool of an emergent organization. Looking at my Reuters terminal and its 1990s interface, I tell myself that the "topic" system is not as novel as it seems on the web! The innovation is that everyone has the possibility of building their own dictionary, and of contributing to it. The innovation is not in the information, but in the platform that allows emergence. Tags are an (improvable) platform for making ontologies emerge. The future consists in extracting the conditions of emergence and striving to reproduce them in other domains, functionally richer than simple topics. Extend this concept and one day we will, for example, make it possible for applications to emerge. Working on a trading floor, a domain hungry for information technology, one can only observe the inability of traditional formalization processes to cope. Rather than fighting this, building a bridge between the informal (Excel spreadsheets, email...) and the formal (workflow, BPM...) looks like a solution. So what are the conditions of emergence? As I see it, the participants need to share a data model. In the case of tags, this is a "topic", a simple object if ever there was one, and canonically interpretable... It is also structurally logical that a whole wealth of applications lies behind emergence, since the previous model of software development presupposes delegation between the end user and the implementer. And the associated agency cost (not necessarily only monetary) has inevitably hidden a whole swath of applications judged non-critical by individual business lines but radically novel collectively... If these subjects interest you, do not hesitate to react or contact me so that we can discuss them online or in person.

Sunday, May 07, 2006

Compilation reloaded

If you are thinking about expressions, abstract syntax trees, dynamic languages... here is an interesting representation to guide your thoughts. It does for performance what you can do for other properties. (From an F# presentation)