Tuesday, February 3, 2009

Indexing Freemind MindMaps with Alfresco - Alf Hack # 2

The idea of this Alfresco hack is to use a command line tool to extract the text from a Freemind .mm file. Integrating this into Alfresco takes two steps:
  1. Add the MIME type application/x-freemind for .mm files
  2. Add a transformer from application/x-freemind to text/plain
This article covers the second step. For adding a new MIME type, please refer to the Alfresco Wiki. The MIME type of Freemind mind maps is application/x-freemind. There is also a nice blog post about adding the Freemind MIME type, and a nice map integration is available.

Extract the text

An example shows how Freemind stores this sample map in an XML file:
<map version="0.7.1">
  <node TEXT="Alfresco Hack No 2">
    <node TEXT="Explore how Freemind XML looks like" POSITION="right">
    </node>
  </node>
</map>
Quite simple XML without namespaces. The text of each map node is stored in the value of the attribute TEXT (attribute names are upper case, and XML is case sensitive). To extract the text I will use a quick-and-dirty XSLT:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="no"/>
  <xsl:template match="/">
    <xsl:call-template name="t1"/>
  </xsl:template>
  <xsl:template name="t1">
    <xsl:for-each select="//node">
      <xsl:value-of select="@TEXT"/>
      <xsl:value-of select="' '"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

Applying this XSLT to the Freemind XML yields the extracted text:
Alfresco Hack No 2 Explore how Freemind XML looks like
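
For readers who want to check the extraction without an XSLT processor, the same traversal can be sketched in Python using only the standard library. This is an illustrative equivalent, not part of the original hack. The names SAMPLE_MAP and extract_text are made up here, and the sample assumes the attribute is spelled TEXT, as the XSLT expects.

```python
import xml.etree.ElementTree as ET

# Sample map, assuming Freemind's upper-case TEXT attribute
SAMPLE_MAP = """<map version="0.7.1">
  <node TEXT="Alfresco Hack No 2">
    <node TEXT="Explore how Freemind XML looks like" POSITION="right">
    </node>
  </node>
</map>"""


def extract_text(mm_xml):
    """Return the TEXT values of all <node> elements, in document order."""
    root = ET.fromstring(mm_xml)
    # iter("node") visits every <node> descendant, like the XPath //node
    texts = [node.get("TEXT", "") for node in root.iter("node")]
    return " ".join(texts)


print(extract_text(SAMPLE_MAP))
# prints: Alfresco Hack No 2 Explore how Freemind XML looks like
```

Unlike the XSLT, which appends a space after every node, this sketch joins the values with single spaces and so leaves no trailing space.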

Add transformer to Alfresco
To keep things simple, I will use Alfresco's ability to run content transformations through external tools or programs. This is done by configuring a RuntimeExecutableContentTransformer bean. First, though, the command line of the external tool has to be worked out. I will use the xmlstarlet command line tool from http://xmlstar.sourceforge.net/. Depending on your Linux distribution, the executable is called either xml or xmlstarlet. A Windows version is also available from the download page. Translating the above XSLT into an xmlstarlet command line gives:
xmlstarlet sel -t -m //node -v @TEXT -o ' ' Alfresco\ Hack\ No\ 2.mm
Sadly, the output always goes to stdout and no output file can be specified. Since the RuntimeExecutableContentTransformer requires an output file, a simple wrapper script can be used. I put the following into the file /home/lothar/bin/freemind2text.sh (made executable with chmod 775), which will then be referenced from the transformer bean:
#!/bin/bash
# save arguments to variables
SOURCE=$1
TARGET=$2

# to see what gets extracted append arguments to logfile
echo "from $SOURCE to $TARGET" >>/tmp/freemindtransform.log

# call xmlstarlet tool and redirect output to $TARGET
xmlstarlet sel --text --encoding UTF-8 -t -m //node -v @TEXT -o ' ' "$SOURCE" > "$TARGET"
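
As a possible alternative to the shell wrapper, here is a minimal sketch of the same source-to-target conversion in Python. It is not the setup used in this post and has not been tested inside Alfresco. The script name freemind2text.py and the function names are hypothetical, and it assumes node text lives in the TEXT attribute.

```python
import sys
import xml.etree.ElementTree as ET


def freemind_to_text(source, target):
    """Write the TEXT values of all <node> elements in source to target."""
    root = ET.parse(source).getroot()
    texts = [node.get("TEXT", "") for node in root.iter("node")]
    # Write UTF-8 explicitly, mirroring the encoding option of the shell wrapper
    with open(target, "w", encoding="utf-8") as out:
        out.write(" ".join(texts))


def main(argv):
    """Expect exactly two arguments: source file and target file."""
    if len(argv) != 3:
        print("usage: freemind2text.py SOURCE TARGET", file=sys.stderr)
        return 2
    freemind_to_text(argv[1], argv[2])
    return 0


# Only act when arguments are given, so the file can also be imported
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

Because the script writes the target file itself, it needs no shell redirection, which might make it usable on Windows as well. That remains an assumption until someone tries it there.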
Now we are ready to configure the RuntimeExecutableContentTransformer bean:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd">
<beans>
  <bean id="transformer.freemindToText" class="org.alfresco.repo.content.transform.RuntimeExecutableContentTransformer" parent="baseContentTransformer">
    <property name="transformCommand">
      <bean name="transformer.freemind.Command" class="org.alfresco.util.exec.RuntimeExec">
        <property name="commandMap">
          <map>
            <entry key="Linux.*">
              <value>/home/lothar/bin/freemind2text.sh ${source} ${target}</value>
            </entry>
            <entry key="Windows.*">
              <value>...whatever windows needs here....</value>
            </entry>
          </map>
        </property>
        <property name="defaultProperties">
          <props>
            <prop key="options"/>
          </props>
        </property>
      </bean>
    </property>
    <property name="explicitTransformations">
      <list>
        <bean class="org.alfresco.repo.content.transform.ContentTransformerRegistry$TransformationKey">
          <constructor-arg>
            <value>application/x-freemind</value>
          </constructor-arg>
          <constructor-arg>
            <value>text/plain</value>
          </constructor-arg>
        </bean>
      </list>
    </property>
  </bean>
</beans>
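
To make the ${source} and ${target} placeholders in the command entry concrete, the following Python sketch shows how such a template expands once file names are filled in. It only illustrates the substitution idea and is not Alfresco's implementation. The function name expand and the temporary file names are made up.

```python
from string import Template

# Command template as configured in the Linux.* entry of the bean
COMMAND = "/home/lothar/bin/freemind2text.sh ${source} ${target}"


def expand(command, source, target):
    """Fill the ${source} and ${target} placeholders of a command template."""
    return Template(command).substitute(source=source, target=target)


# Hypothetical file names; the real ones are supplied by Alfresco at runtime
print(expand(COMMAND, "/tmp/in_1234.mm", "/tmp/out_1234.txt"))
# prints: /home/lothar/bin/freemind2text.sh /tmp/in_1234.mm /tmp/out_1234.txt
```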



Finished!
Freemind mind maps will now be indexed. On the plus side: no Java coding, just configuration of standard Alfresco features. On the down side: ...is there anything? Could anybody contribute the Windows batch file wrapper for the xmlstarlet call?

Monday, February 2, 2009

CMIS Link collection

Random link collection about CMIS: blogs, specs, and samples from Alfresco, EMC and others.
John Newton F2F
http://craigrandall.net/archives/2008/09/cmis/
https://community.emc.com/servlet/JiveServlet/previewBody/1606-102-1-2762/h3951-cmis-wp_2.pdf
http://chucksblog.typepad.com/chucks_blog/2008/09/cmis----its-not.html
https://community.emc.com/docs/DOC-1606

OASIS
CMIS home: http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=cmis
Members: http://www.oasis-open.org/committees/membership.php?wg_abbrev=cmis
JIRA: http://tools.oasis-open.org/issues/browse/CMIS
CMIS TC list: http://lists.oasis-open.org/archives/cmis/
CMIS comments list: http://lists.oasis-open.org/archives/cmis-comment/
http://xml.coverpages.org/cmis.html
http://info.emc.com/mk/get/DAP_RE?P.ctp_program_execution.Source_ID=16706
https://community.emc.com/community/labs/cmis
http://roy.gbiv.com/untangled/2008/no-rest-in-cmis
http://intertwingly.net/blog/?q=cmis
http://www-01.ibm.com/software/data/content-management/cm-interoperablity-services.html
http://blogs.msdn.com/ecm/archive/2008/09/09/announcing-the-content-management-interoperability-services-cmis-specification.aspx
http://blogs.msdn.com/ecm/
http://blogs.nuxeo.com/sections/blogs/florent_guillaume/2009_02_02_cmis-meeting-notes
http://blogs.the451group.com/information_management/2008/09/10/cmis-and-industry-standards-in-ecm/
Also a nice link collection on CMIS:
http://weblogs.goshaky.com/weblogs/test/search?q=collaboration