Fleximatcher, a library to help parse natural language

Some months ago, I stumbled across this amazing article about transforming an arbitrary English text in a patent application. The underlying pattern library allows, among other nice things, to find patterns like “The [an adjective] [a noun] and the [a noun]” easily, look for hypernyms (“is a” relationships between expressed concepts, for example “animal” is an hypernym of “cat”) in WordNet, and conjugate verbs in various languages comprehending Italian.

With a library like this extracting information from text becomes a lot easier, but unfortunately there are no exact replacements for this in Java, and the coverage of Italian is partial (no hyperonymy information, a small set of PoS tags), so why not try to create a specific library in Java?

Reinventing the wheel: UIMA and GATE

At first the idea was to integrate a PoS tagger like the one from OpenNLP and match expressions like “The [pos:ADJ] [pos:SS] and the [pos:SS]“, or “[hyp:animal]” allowing the user to easily express patterns and look for them, then I started thinking about additional patterns like “[r:[a-z]+[0-9]]” to match regular expressions, or “[reddit_nick]” to match Reddit usernames, and found that providing a set of core expressions and letting the users define their own could be way more useful.

Moreover, an user may want to define expressions to match common patterns, for example “[r:[1-9][0-9]*] [r:[AP]M]” may be assigned to the tag “hour” in order to directly write expressions like “it’s [tag:hour]” to match “it’s 3 PM“, so user-defined expressions may recursively match patterns with other expressions. This is exactly what generative grammars do, and while we are at it why not let the user defined expression add metadata to the matched text and read metadata added by other rules?

If this does ring a bell, it’s because the idea isn’t new: annotation engines like UIMA and GATE does exactly this.

The problem is that those are both pretty hard to use and configure, though uimaFIT is somehow making things easier, and I wanted a library that could be used immediately and embedded in an application without writing XML configuration files, installing Eclipse plugins or use standalone applications to test the results.

So, a Java 8 library called Fleximatcher was born. The name stands for flexible matcher (my boss says I’v e plenty of fantasy🙂 ) The usage is pretty simple:

  1. add expressions (the pos or hyp in the examples above) and bind them to your annotators, or use built in annotators
  2. add tags, that are expressions matching patterns (the hour tag above)
  3. Ask the library whether a string matches a pattern, and if so which part of the string match which expressions, and which annotations are returned.

JSON is adopted as a format to express annotations, and to define tags, which in the end are rules of a generative grammar, is possible to load a TSV file in this format:

adjective_attribution  [it-pos:Ss] [it-pos:As] {‘name’:#0#,’attribute’:#2#}
information [tag:adjective_attribution] {‘name’:#0.name#,’attribute’:#0.attribute#}

The first rule matches Italian singular names (Ss) followed by Italian singular adjectives (As), and create a JSON string with the keys name and attribute and the name and adjective as values, since #0# is the first matched expression, #1# is the space between them and #2# is the annotation.

The second rule takes what was marked by the first and encapsulate it in another JSON string, taking the attributes from it.

This way is possible, with a single line of a TSV file, to define rules that generate annotations based on annotations from other rules, reading both the annotated text spans and the annotation attributes. It’s also possible to define more complex tags with a custom tag class.

Using about 70 rules it was possible to extract hyponyms, adjective attributions to names and subjects and objects of verb from a whole dump of Wikipedia, a topic I’ll talk about another time.

Parsing with no leftmost derivation

The parser is a top-down one, it tries to match the expressions composing the given pattern and expand tags by matching the correspondent patterns. Since expressions like PoS tags are identified by looking at the surrounding text, and ambiguous grammars are fine (multiple interpretations can be returned), leftmost derivation used by LL parsers is difficult to get.

To be faster the parser uses a trick (two if we consider caching): since annotators can only add annotations, never remove or modify existing ones, if an annotator is called twice on the same text and with the same number of annotations, we can safely assume that the second time it will not retrieve anything, and avoid trying.

This is done by analyzing the chain of annotation handlers. Annotation handlers are for Fleximatcher what the CAS is for UIMA: an element which stores annotations from different annotators.
Fleximatcher annotation handlers are of course a lot simpler, they just allows annotators to store and retrieve annotations which are kept in memory, but offer the possibility of searching for sequences of adjacent annotations and get sub-handlers which are wrappers passed around when expanding an expression to its own pattern.

Each time a tag is asked to annotate a text, it checks whether it was called previously querying the annotators up to the hierarchy and the amount of annotations. If it finds another one with the same number and the same tag expression, it means that due to the deterministic nature of annotators it won’t add annotations and the recursion stops.

This is done not just for efficiency, but also to let the user detect cycles and anomalies int he given grammar: if we can exclude “fake” cycles there’s less noise hiding real ones, which throw specific exceptions when found.

To make the parsing easier, a PoS tagger for Italian, based on OpenNLP and with tens of different words, is already present, and a web application allows to define tags and test patterns on text on the fly, seeing the results in a webpage

The Fleximatcher web interface

The Fleximatcher web interface

The application is just a REST interface to Fleximatcher, and allows to call the library from other languages.

It would be nice to adapt the library to Apache Stanbol enhancement engine, while for UIMA the project Ruta seems a better fit.

Trees shot with the PiNoir sensor, without filters

Create a simple infrared webcam using a Raspberry Pi, Pi noir and Flask

Some months ago I bought a Rapspberry Pi B+ and a Pi noir sensor, with the idea of using it as a small server and take IR pictures.

Once the sensor is attached, is possible to take pictures with the raspistill command

raspistill -o picture.jpg

it has many parameters to regulate exposition, color and image format, but this usage is perfect for most cases.

Since the human eye cannot see within the infrared spectrum (above wavelengths of 750 nm), and consequently computers are not equipped to represent it, the camera has to remap colors to make room for it. The result is that infrared pictures have strange colors, in this case a yellow tone:

The color depends on the kind of room lighting, in this case a fluorescent light which should emit less IR light than an arc lamp.

The PiNoir does not see the full infrared range (which ranges from the 750 nm wavelength of the visible light  to 1mm of the far infrared used by infrared telescopes), in fact it detects the near-infrared range (around 800 nm), so don’t expect anything like the Predator or Hollow man thermal vision, which is based on mid infrared,  but it still produces something interesting when looking at plants:

Trees shot with the PiNoir sensor, without filters

Trees shot with the PiNoir sensor, without filters


The trees with the real colors

The same street with the “real” colors (from Google StreetMap)

The trees and the bushes appear pink, while they are green, and looking at the green building in the corner is clear that is not just a mapping from green to pink. The camera is somehow giving a totally different color only to plants.

This happens because the plants chlorophyll absorbs a good amount of light in the near infrared range, unlike the paint used for the building.

Chlorophyll absorption over the visible light spectrum (click on the image for more details)

This phenomenon is used by both NASA and ESA to determine the health of the vegetation: deforested and dry areas appear of a different color in the near infrared, and different kind of plants have different spectral signatures.

False colour Proba-V image from 4 February 2014 showing deforestation in Argentina (Photo:ESA/VITO)

False colour Proba-V image from 4 February 2014 showing deforestation in Argentina (Photo:ESA/VITO)

The PiNoir sensor is sold with a blue filter, a transparent blue plastic rectangle which filters colors and let us take pictures enhancing the IR spectrum, made specifically to inspect plants health.

The same picture above with the filter looks like this:


Trees shot with the PiNoir sensor, with IR filter

Trees shot with the PiNoir sensor, with the filter on


Now plants appear yellow, and using the Infragram tools we can make the vegetation areas brighter

The picture taken with the filter on, manipulated by infragram

The picture taken with the filter on, manipulated by infragram

If to take pictures I can use SSH, to see them I need to attach the Raspberry to a TV or transfer them using SFTP (I tried to see them in the terminal using ASCII art, but it wasn’t amazing), both uncomfortable solutions.

With a friend we started to develop a small Flask application (here) to run the raspistill command and get the resulting image in a web page.

Flask is a Python framework which allows creating web applications by binding class methods to URL paths and populating Jinja templates to generate web pages.

Among the cool thing of Flask, is how easy it is to create simple API, in this case an API which just takes an infrared picture, saves it in a folder visible from the web and return a JSON response with the name of the file.

Using a small Javascript code is possible to invoke this API (actually, it’s one method) and show the picture in the browser. Since the process takes a few seconds, the user sees a text message with the status.

Once the image is loaded, it’s easy to call the “API” periodically, in order to take a picture and display it every 60 seconds, until the user closes the page. It supports multiple users (the camera will give an error if trying to take two simultaneous pictures, but this will not crash the application).

Now this camera can be used remotely from any browser, including the mobile one.

Eliza, in italiano

locandina polacca di Terminator

“Terminator” in polacco significa apprendista, e non essendo un nome particolarmente accattivante fu diffuso in Polonia con il nome assassino elettronico. La creazione di macchine pensanti è sempre stata vista come imminente dal cinema di fantascienza.

Uno dei primi ricordi che ho dell’informatica in generale è di aver letto, da bambino, una descrizione di Eliza.

Si tratta di un programma creato nel 1966, che cerca di imitare una chat con uno psicologo, ed è probabilmente il primo chatbot nella storia.

Col tempo questo programmino mi ha intrigato sempre più, nel mio immaginario un giorno vicino i computer avrebbero risposto in maniera intelligente, proprio come nei film di fantascienza (invece, il sistema attualmente più sofisticato e costoso crede che Toronto sia negli USA), e l’idea di parlare con un’intelligenza non umana, con una visione del mondo e un modo di pensare “alieni” è decisamente affascinante.

Un’altra cosa notevole di Eliza è che dopo tanto tempo non è ancora stata soppiantata da software molto migliori. Usare abbondanti log di conversazioni reali e database semantici non ha cambiato molto la situazione. Uno si aspetterebbe che con computer migliaia di volte più potenti sotto ogni aspetto e un certo interesse commerciale certi programmi possano vedere la luce, e invece, a distanza di quasi 50 anni, un programma scritto da un singolo individuo su un apparecchio con una potenza di calcolo inferiore a un Nokia 3310 dà ancora le patate a chatbot come Cleverbot.

Un chatbot è stato proprio il primo programma che ho provato a sviluppare quando ho imparato a programmare, e nel corso del tempo ho fatto vari tentativi. In effetti, creare una mia versione di Eliza è stata proprio la prima cosa che ho provato a fare quando ho messo le mani su un ambiente GWBASIC, creando un simulatore del rappresentante d’istituto delle superiori (un minuscolo programma che reagiva ad alcune parole chiave e alla punteggiatura rispondendo con frasi generiche o tipiche dell’individuo in questione).

Recentemente ho fatto un nuovo tentativo di creare un chatbot, stavolta un bot per IRC che usa il parser Fleximatcher (ancora molto incompleto, ma con cui almeno posso fare un po’ di NLP in italiano) per riconoscere delle strutture nel testo e rispondere con frasi apposite, o estraendo informazioni da usare in seguito. Mi sono fermato dopo avere aggiunto la possibilità di riconoscere i saluti (identificando chi saluta chi), il termine “foto a infrarosso” e le sue varianti definite da una piccola grammatica generativa(questo dopo una serie di discussioni sulla fotografia a infrarossi di cui magari parlerò più avanti) e poco altro, perché la varietà del linguaggio naturale è mostruosa e più regole si aggiungono nel tentativo di renderlo più reattivo e più è facile che qualcuna di queste dia dei falsi positivi, problema che fa sì che Conceptnet creda che i metalli aurei siano un tipo di “di solito.
Eliza, insomma, funziona bene perché  è tremendamente difficile ribattere in maniera intelligente e ragionata a una frase ma è molto facile dare risposte generiche, soprattutto se si usa il trucco di interpretare uno psicologo e quindi poter eludere le domande mirate (“cosa ne pensa del Go?”->”crede che la mia risposta sia importante?”).
Ho cercato una versione di Eliza in italiano, e ho trovato solo questo. A parte il fatto che il codice non è disponibile, il dialogo non è stato proprio eccelso:
Io: Ciao
Bot: In Italiano si usa dire “ciao”, ma io preferisco… Hasta la vista, baby!
Io: Do you speak English?
Bot: Non ho niente da dire su YOU SPEAK ENGLISH, prova a dire qualcosa tu…
Io: okay, cosa ne pensi dei salumieri?
Bot: Forse volevi dire CHE NE PENSI DEI GAY…
Considerando che il personaggio di Eliza da cui prende il nome cercava di entrare nell’alta società con modi forbiti, potrebbe almeno dare del lei.
Per la cronaca, il dialogo con Cleverbot è stato solo leggermente migliore:
Bot:How are you?
Io:Fine! Do you speak Italian?
Bot:No I do not.
Io:Okay, What do you think about butchers?
Bot:That you’re not using your whole potential
 Ho trovato poi questo porting di Eliza in Javascript, il quale mantiene abbastanza la semplicità del programma originale, incapsulando le risposte e i pattern a cui sono abbinate in un benedetto file JSON.
Ho tradotto questo file di configurazione in italiano e lo potete trovare qui. Ho cercato di tradurlo direttamente senza modifiche alla logica, ma le differenze tra italiano e inglese rendono necessario qualche ritocco: poiché in inglese sostantivi e aggettivi non hanno quasi mai un genere, bisogna differenziare “mio” e “mia” e prevedere risposte differenziate; in italiano “perché” traduce sia “why” che “because”, quindi non si possono distinguere le domande dalle affermazioni sulle cause senza cercare punti di comanda.
Comunqe, il dialogo con la traduzione di Eliza è stato così:
ELIZA: Come va? Mi dica il suo problema
YOU: do you speak english ?
ELIZA: Parlo solo italiano.
YOU: okay, cosa ne pensi dei salumieri?
ELIZA: Ha mai chiesto a altri ?
YOU: Sì, a un’altra sua traduzione in italiano
ELIZA: Per favore continui.
YOU: ha detto “Forse volevi dire CHE NE PENSI DEI GAY…”
ELIZA: Non è certa ?
YOU: chi?
ELIZA: Perché lo chiede ?
 sembra che alcune domande e risposte vengano ignorate totalmente, probabilmente ho sbagliato qualcosa nello scrivere le regole, ma una funzione molto carina di Eliza è la possibilità di ricordare battute passate:
ELIZA: Per favore, mi dica cosa la turba
YOU: ciao
ELIZA: Salve
YOU: ciao
ELIZA: Salve di nuovo, che cosa la affligge?
YOU: salve
ELIZA: Come va? Per favore, mi dica il suo problema.
YOU: ciao
ELIZA: Ho già detto salve, non abbia reticenze
YOU: do you speak English ?
ELIZA: Parlo solo italiano.
YOU: deutsche ?
ELIZA: Non credo di capire bene.
YOU: sprechen Sie Deutsch?
ELIZA: Le ricordo che non parlo tedesco.
 La capacità di discutere di Eliza non va oltre, sembra che per Terminator dovremo aspettare, ma già così è più gentile e intelligente della stragrande maggioranza dei commenti ai quotidiani online.
pixel values of the street picture

Seam carving in pure Java

Today I tried to implement seam carving in pure Java, without dependencies.

In a few words, it’s a method to automatically reduce the size of an image keeping as much details as possible. How is this accomplished? It’s done by calculating the importance of each pixel with some metric (for example, the gradient magnitude) and removing the less important pixels at first.

The result is made clear by this example from Wikipedia:

The original image, before rescaling. It's the Broadway Tower, Cotswolds, UK.

The original image, before resizing. It’s the Broadway Tower, Cotswolds, UK.

The same image after rescaling.

The same image after resizing.

The person and the castle remained untouched, while the sky and the grass in between has been removed. GIMP has a plugin to do so, some time ago I played with it and had the idea to implement it in pure Java, as a small library.

Here it is, with some usage documentation. It’s single class, with some static methods to change the width of an image.

First, we define a metric to assign each pixel an importance level. A possible criteria is the sum of the differences between the color of the pixel and the surrounding ones. Pixels on the edge of  a part of the image, like the castle or the person, have a least one neighbor with a different color, obtained as the difference of red, green and blue channel values. If the two pixels are expressed as RGB integers this value is calculated as

Math.sqrt(Math.pow((a & 0xFF) – (b & 0xFF),2)
+ Math.pow(((a & 0xFF00) >> 8) – ((b & 0xFF00) >> 8),2)
+ Math.pow(((a & 0xFF0000) >> 16) – ((b & 0xFF0000) >> 16),2))

this value is to number between 0 (same color) and 443 (difference between black and white, since sqrt(3*256^2)=443)

Once we calculate the importance of each pixel, we cannot just remove the N pixel with the lower importance in the row, or we may remove pixels in different areas and get a shifted image like this:

Stupid resizing

The Milan  skyline applying stupid resizing

when we remove a pixel from a row, the others get shifted, and when we delete pixels from different areas the rows are no more aligned and we get this effect; of course the more pixels we remove the worst the image becomes.

To avoid this, we have to remove pixels along a path, so if we remove the one at (x,y) we’ll then proceed to remove the one at (x,y+1) or (x+1,y+1) or (x-1,y+1), not just anyone in the row, preserving the shapes of the image.

To find the path through the image with the lower importance I used this algorithm:

  1. Create a matrix with the same size of the image, we’ll call this matrix minimum cumulative importance, MCI.
  2. Set the bottom row of MCI to 0
  3. For each remaining row from the bottom to the top, calculate the MCI value as the sum of the importance of the corresponding pixel an the minimum importance among the three underlying pixels
  4. Now find the lowest value in the top row, let’s call its column number xl (X coordinate of the Less important), and mark it
  5. At the second row from the top, find the lowest MCI value among the three on column xl-1, xl and xl+1, mark it
  6. Continue iterating over the rows, marking one pixel for each one, always picking among the three pixels right under the one chosen at the previous row

This is the same method used by hidden Markov models to calculate the hidden state sequence, called forward–backward algorithm.

Now, what to do if we want to remove more than one pixel? We can calculate the MCI matrix once and find the shortest paths many times, marking the already found ones and avoiding them when choosing the pixels in the steps 4 and 5.

This doesn’t work: we can have many paths creating Y-shaped pattern which at some point doesn’t leave pixels free for new paths. In this case, we can allow the exception and let the path skip to the less important pixel in that row not marked by other paths, risking the same effect we got on stupid resizing above.

The result is this:

resize with multiple paths and jump

The image resized removing 300 paths, allowing paths to “jump” when other paths took all the available pixels

Better than before, but still awful. The paths tends to be close to each others, as we can see drawing them:

the paths to be removed, marked in negative

The paths to remove from the picture, marked in negative. Do you see the paths appearing from the center of the image? Those have “jumped” and are causing that awful effect

This is the MCI matrix, red pixels are the most important

cumulative importance matrix

The minimum cumulative importance matrix of the Milan skyline picture. The color of a pixel represent the minimum cumulative cost (importance) of the path to reach it from the bottom.

You can note that the triangular “stains” left by the high variance parts: the edges of those triangles are the paths able to bypass them, following a 45° direction.

The solution is to remove one path and recalculate the MCI many times. As you may guess, this is painfully slow, but the result is way better:

resize multiple steps

The Milan skyline resized by recalculating the MCI matrix after each path removal

Images full of straight lines are more difficult to resize, let’s try with my favorite painter, Hieronymus Bosch, and his most famous paint, The Garden of Earthly Delights:

The Garden of Earthly Delights - original

The Garden of Earthly Delights – original

This is the MCI matrix:

The Garden of Earthly Delights - pixel importance

The Garden of Earthly Delights – pixel importance

Unsurprisingly, the borders between the panels are the less important area. And in fact, they get removed

The Garden of Earthly Delights - liquid rescaled

The Garden of Earthly Delights – resized

And here the same example with a street

the original street picture

the original street picture

the resized street picture

the resized street picture

pixel values  of the street picture

pixel values of the street picture