grafeno.transformers package

Transformers are one of the key objects of the grafeno library. They are in charge of converting the dependency parse of a sentence, extracted by an external tool, into a grafeno semantic graph.

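from grafeno import Graph as CG
from grafeno.transformers import get_pipeline

# Build a transformer class out of the listed transformers
T = get_pipeline(['pos_extract', 'wordnet', 'unique'])
# Parse the text and convert it into a semantic graph using that transformer
g = CG(transformer=T, transformer_args={}, text="Fish fish fish fish fish fish fish.")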

This process happens in stages. First, morphological nodes are transformed into semantic ones:

Semantic nodes

Semantic nodes are dictionaries with the following attributes:

  • concept: if present, the node will be added to the semantic graph. It represents the main idea, or meaning, of the node. If there is no concept, no semantic node will be produced corresponding to the morphological one.
  • id: a temporary identifier for the node while it is being processed and has not yet been added to the graph. When the node is finally added to the graph, it is replaced by the proper graph ID.

Other attributes in the dictionary are also added to the semantic graph node, and are referred to as grammatemes.
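
The rules above can be illustrated with plain dictionaries. This is only a sketch: the helper split_node, the grammateme name sempos and all values are hypothetical, and the check for key presence is an assumption about what "if present" means, not grafeno's actual code.

```python
# Hypothetical semantic node, as a transformer might produce it.
sem_node = {
    "id": 3,            # temporary id, replaced once the node is in the graph
    "concept": "fish",  # a concept is present, so the node is kept
    "sempos": "n",      # any other key is a grammateme (name is illustrative)
}


def split_node(node):
    """Return (concept, grammatemes), or None if the node has no concept."""
    if "concept" not in node:
        return None  # no concept: no semantic node is produced
    grammatemes = {k: v for k, v in node.items() if k not in ("id", "concept")}
    return node["concept"], grammatemes


kept = split_node(sem_node)
dropped = split_node({"id": 4, "pos": "punct"})
```

Here kept holds the concept together with its grammatemes, while dropped is None because the second dictionary has no concept.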

After the nodes have been processed, the dependency relations are transformed into semantic edges:

Semantic edges

Each semantic edge is a dictionary with the following attributes:

  • parent: the (temporary or otherwise) id of the source node.
  • child: the (temporary or otherwise) id of the target node.
  • functor: if present, the edge will be added to the semantic graph. The functor represents the semantic relation between the parent and child nodes.

Other attributes in the dictionary are also added to the semantic graph edge, and are referred to as grammatemes.
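
A minimal sketch of how such edge dictionaries could be filtered follows. The function keep_edges, the id_map argument, the functor value and the weight grammateme are all hypothetical; in particular, resolving temporary ids through a lookup table is an assumption made for illustration, not a description of grafeno's internals.

```python
def keep_edges(edges, id_map):
    """Keep edges that have a functor; rewrite temporary ids to graph ids."""
    kept = []
    for edge in edges:
        if "functor" not in edge:
            continue  # no functor: the edge is not added to the graph
        resolved = dict(edge)  # copy, so the input edge is left untouched
        # Ids missing from id_map are assumed to be graph ids already.
        resolved["parent"] = id_map.get(edge["parent"], edge["parent"])
        resolved["child"] = id_map.get(edge["child"], edge["child"])
        kept.append(resolved)
    return kept


edges = [
    {"parent": 1, "child": 2, "functor": "AGENT", "weight": 1.0},
    {"parent": 1, "child": 3},  # dropped: it has no functor
]
id_map = {1: 10, 2: 11}  # hypothetical temporary id -> graph id table
result = keep_edges(edges, id_map)
```

Only the first edge survives, with its extra weight attribute carried along as a grammateme.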

Apart from the main node and edge transformations, the process has additional stages where processing can happen before or after them. To assemble this collection of processing stages, a Transformer object has to be created. A base class is provided for this purpose: it has a method for each stage and is in charge of calling them at the right time and with the appropriate arguments.

The way to construct a pipeline is thus to inherit from this base class, and extend the appropriate methods. See its documentation for more information on them.
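
The inherit-and-extend pattern can be sketched with a toy base class. ToyTransformer, its method names and the run driver are hypothetical stand-ins written for this example; refer to the base class documentation for the real stage methods and their arguments.

```python
class ToyTransformer:
    """Hypothetical stand-in for the base class; runs the stages in order."""

    def transform_node(self, ms_node):
        # Base behaviour: a node with only a temporary id and no concept.
        return {"id": ms_node["id"]}

    def transform_edge(self, dep):
        # Base behaviour: an edge with no functor.
        return {"parent": dep["head"], "child": dep["dependent"]}

    def run(self, ms_nodes, deps):
        nodes = [self.transform_node(n) for n in ms_nodes]
        edges = [self.transform_edge(d) for d in deps]
        # Only nodes with a concept and edges with a functor are kept.
        return ([n for n in nodes if "concept" in n],
                [e for e in edges if "functor" in e])


class NounsOnly(ToyTransformer):
    """Extends the node stage: nouns receive their lemma as concept."""

    def transform_node(self, ms_node):
        sem = super().transform_node(ms_node)  # let the parent stage run first
        if ms_node.get("pos") == "noun":
            sem["concept"] = ms_node["lemma"]
        return sem


nodes, edges = NounsOnly().run(
    [{"id": 0, "lemma": "fish", "pos": "noun"},
     {"id": 1, "lemma": ".", "pos": "punct"}],
    [{"head": 0, "dependent": 1, "rel": "punct"}],
)
```

Calling super() in each overridden method is what lets several such classes be stacked later without one silently replacing the work of another.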

Each transformer class is meant to perform one specific operation. A transformation pipeline can then be built by mixing and matching the desired transformers, that is, by creating a class which inherits from them. To make this easier, a convenience function is provided: grafeno.transformers.get_pipeline, which takes a list of transformers to use and constructs the class which inherits from them all in the correct order.

grafeno.transformers.get_pipeline(modules)

Takes a list of transformers and returns a transformer class which subclasses them all.
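
One way such a function could compose classes is with Python's built-in type(). The sketch below is not grafeno's implementation: toy_get_pipeline and the classes Base, Lemma and Upper are hypothetical, the sketch takes classes rather than names, and the choice to reverse the list (so that later entries wrap earlier ones) is an assumption about what "the correct order" means.

```python
class Base:
    def transform_node(self, ms_node):
        return {"id": ms_node["id"]}


class Lemma(Base):
    def transform_node(self, ms_node):
        sem = super().transform_node(ms_node)
        sem["concept"] = ms_node["lemma"]  # use the lemma as the concept
        return sem


class Upper(Base):
    def transform_node(self, ms_node):
        sem = super().transform_node(ms_node)
        if "concept" in sem:
            sem["concept"] = sem["concept"].upper()
        return sem


def toy_get_pipeline(classes):
    # Assumption: reversed, so the last listed class is first in the MRO
    # and therefore runs after the earlier ones have done their part.
    return type("Pipeline", tuple(reversed(classes)), {})


Pipeline = toy_get_pipeline([Lemma, Upper])
node = Pipeline().transform_node({"id": 0, "lemma": "fish"})
mro_names = [c.__name__ for c in Pipeline.__mro__]
```

With this ordering, Upper sees the concept that Lemma has already set, which is why cooperative super() calls matter in each transformer.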