EMC Developer Network
 



EMC Centera SDK

Posted by Aashish Patil (patil_aashish AT emc.com) on July 22, 2006

A few months back (April, to be precise), I attended the EMC World conference in Boston. One of the sessions I attended covered the EMC Centera SDK. I went in with some apprehension: I had absolutely no knowledge of Centera, and here I was attending a session on the Centera API. As the session progressed, it slowly dawned on me that learning about Centera was not like learning nuclear physics.

Centera is what is called a Content Addressed Storage (CAS) system. A CAS system creates a digital fingerprint of the data to be stored. This fingerprint helps ensure that the same data is not stored twice. It also helps maintain data integrity: the stored fingerprint can be compared with the value computed when the content is accessed. The fingerprint is generated using a hash algorithm such as SHA or MD5. Unfortunately, I don't know the exact algorithm used. Centera is also an object-oriented (the model is discussed later), location-independent data storage system that is particularly useful for fixed content. Fixed content is any content that is not expected to change often (i.e., archivable content). Examples of fixed content are X-rays, images, and email archives.
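To make the fingerprint idea concrete, here is a minimal sketch in plain Java. It is not Centera code: SHA-256 is picked purely for illustration (as noted, I don't know which algorithm Centera actually uses), and the class and method names are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative only: SHA-256 is an assumption, not necessarily what Centera uses.
final class ContentFingerprint {

    // Returns the lowercase hex SHA-256 digest of the given bytes.
    static String fingerprint(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(content);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    // Integrity check: recompute the fingerprint and compare it with the stored one.
    static boolean isIntact(byte[] content, String storedFingerprint) {
        return fingerprint(content).equals(storedFingerprint);
    }

    public static void main(String[] args) {
        byte[] original = "x-ray image bytes".getBytes(StandardCharsets.UTF_8);
        byte[] copy = "x-ray image bytes".getBytes(StandardCharsets.UTF_8);
        byte[] altered = "x-ray image bytez".getBytes(StandardCharsets.UTF_8);

        String stored = fingerprint(original);
        System.out.println("fingerprint: " + stored);
        // Same bytes give the same fingerprint, so a duplicate can be detected.
        System.out.println("duplicate detected: " + stored.equals(fingerprint(copy)));
        // Changed bytes give a different fingerprint, so corruption can be detected.
        System.out.println("altered content intact: " + isIntact(altered, stored));
    }
}
```

Identical content always maps to the same fingerprint, which is what lets a CAS system skip storing a duplicate, and any change to the bytes shows up as a mismatch when the fingerprint is recomputed on access.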

Centera Model

A Centera archive consists of a cluster that is accessible via the SDK. A cluster is located by its IP addresses. A pool is an SDK object that represents one or more clusters.

In the archive, the base object is a C-Clip. A C-Clip is a bundle that contains one or more pieces of fixed content. A C-Clip contains one or more 'tags', and each 'tag' contains one 'blob'. A blob represents a binary object. Attributes (name-value pairs) can be added to both the clip and its tags as metadata describing their contents.

The C-Clip is an in-memory representation. When it is persisted to Centera storage, it is saved as a Clip Descriptor File (CDF), which is an XML file.

When a content piece is saved as a 'blob', Centera returns the unique id (digital fingerprint) for that content piece, which is then saved in the containing 'tag'.

Finally, when a clip is saved, Centera returns a clip id, which the client can save to access the clip in the future. For example, the clip id can be stored in a database table.
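The relationships in the model above can be pictured with a toy in-memory version. This is only a sketch to show how clips, tags, blobs and ids relate. It is not the Centera SDK: every class and method name is hypothetical, the descriptor is plain text rather than XML, and SHA-256 stands in for whatever fingerprint Centera really uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model of clips, tags and blobs. This is NOT the Centera SDK;
// every class and method name here is hypothetical.
final class ToyArchive {

    static final class Tag {
        final String name;
        final Map<String, String> attributes = new LinkedHashMap<>();
        String blobId; // set once the blob has been written
        Tag(String name) { this.name = name; }
    }

    static final class Clip {
        final String name;
        final Map<String, String> attributes = new LinkedHashMap<>();
        final List<Tag> tags = new ArrayList<>();
        Clip(String name) { this.name = name; }
    }

    private final Map<String, byte[]> blobs = new HashMap<>(); // blob id -> content
    private final Map<String, String> clips = new HashMap<>(); // clip id -> descriptor

    // Stores the content once per distinct fingerprint and records the id in the tag.
    String writeBlob(Tag tag, byte[] content) {
        String blobId = sha256Hex(content);
        blobs.putIfAbsent(blobId, content.clone());
        tag.blobId = blobId;
        return blobId;
    }

    // Persists a plain-text stand-in for the descriptor and returns its id.
    String writeClip(Clip clip) {
        StringBuilder descriptor = new StringBuilder("clip " + clip.name + " " + clip.attributes);
        for (Tag tag : clip.tags) {
            descriptor.append("\n  tag ").append(tag.name).append(' ')
                      .append(tag.attributes).append(" blob=").append(tag.blobId);
        }
        String text = descriptor.toString();
        String clipId = sha256Hex(text.getBytes(StandardCharsets.UTF_8));
        clips.put(clipId, text);
        return clipId;
    }

    String readDescriptor(String clipId) { return clips.get(clipId); }

    int blobCount() { return blobs.size(); }

    private static String sha256Hex(byte[] data) {
        try {
            StringBuilder hex = new StringBuilder();
            for (byte b : MessageDigest.getInstance("SHA-256").digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is mandatory on every JVM
        }
    }

    public static void main(String[] args) {
        ToyArchive archive = new ToyArchive();
        Clip clip = new Clip("my_new_clip");
        clip.attributes.put("my_attr1", "My Attr Value");

        Tag tag = new Tag("CA1Sunset");
        tag.attributes.put("filename", "CA1_Sunset.jpg");
        clip.tags.add(tag);

        archive.writeBlob(tag, "pretend these are JPEG bytes".getBytes(StandardCharsets.UTF_8));
        String clipId = archive.writeClip(clip);

        // A client would keep clipId somewhere durable, e.g. a database row.
        System.out.println("clip id: " + clipId);
        System.out.println(archive.readDescriptor(clipId));
    }
}
```

The point to notice is that the client never chooses where content lives. It hands over bytes and gets ids back, and the clip id is the only thing it needs to remember.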

The SDK

The SDK is written in C. A Java wrapper is also available. The sample in this article uses the Java SDK. The SDK and the programming guides are available for download from the Centera Developer Portal. There are also public Centera pools available if you want to fiddle with the SDK. The information for these pools is available on the portal. This makes it really easy to try out Centera without having to own one.

Typical Sequence

A typical sequence for storing an object goes like this -
Open a connection to a cluster - Pool Open
--Create a C-Clip. This is done in memory.
----GetTopTag
------Create a new Tag
--------Write a BLOB
------Close Tag
----Write Clip
--Close Clip
Close connection to cluster.

Sample

Here is a very simple program that follows the above sequence -

	
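    // Imports needed at the top of the enclosing source file.
    // com.filepool.fplibrary is the package the Centera Java SDK classes ship in.
    import java.io.FileInputStream;

    import com.filepool.fplibrary.FPClip;
    import com.filepool.fplibrary.FPPool;
    import com.filepool.fplibrary.FPTag;

    private static String storeFile(String clusterIP, String filepath)
    {
        try
        {
            // Open a connection to the cluster
            FPPool fpool = new FPPool(clusterIP);

            // Create the C-Clip in memory and describe it
            FPClip clip = new FPClip(fpool, "my_new_clip");

            clip.setDescriptionAttribute("my_attr1", "My Attr Value");
            clip.setDescriptionAttribute("my_attr2", "attr2 value");

            // Create a new tag under the clip's top tag
            FPTag topTag = clip.getTopTag();

            FPTag myTag = new FPTag(topTag, "CA1Sunset");
            myTag.setAttribute("my_tag_attr", "tag attr value");
            myTag.setAttribute("filename", "CA1_Sunset.jpg");
            myTag.setAttribute("mydesc", "Sunset picture taken from ca 1");

            // The top tag is no longer needed once the child tag exists
            topTag.Close();

            FileInputStream fin = new FileInputStream(filepath);

            // Write the file content as the tag's blob
            System.out.println("About to write blob");
            myTag.BlobWrite(fin);
            System.out.println("Blob written");

            // Persist the clip; the returned id is what you keep to find it again
            System.out.println("About to write clip");
            String clipId = clip.Write();
            System.out.println("Wrote clip");

            // Release everything, ending with the connection to the cluster.
            // Note: if an exception is thrown earlier, these are not closed.
            myTag.Close();
            fin.close();
            clip.Close();
            fpool.Close();
            System.out.println("Got clip id: " + clipId);
            return clipId;
        }
        catch (Exception ex)
        {
            ex.printStackTrace();
        }
        return null;
    }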
	
		
	

Make sure that you have added the folder that contains the Centera native libraries to the PATH variable. On Linux, you might need to add the folder to the LD_LIBRARY_PATH variable.
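If the native libraries are not being picked up, a quick check like the following can help. It is a generic sketch and not part of the SDK. It assumes the Java wrapper loads its native code through the standard JVM mechanism, which searches java.library.path (derived from PATH on Windows and LD_LIBRARY_PATH on Linux). The class name is made up, and you pass in the file name of the native library as it appears in your SDK installation.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Centera SDK.
final class NativeLibLocator {

    // Returns every directory in pathList that contains a file named fileName.
    static List<String> dirsContaining(String pathList, String fileName) {
        List<String> hits = new ArrayList<>();
        if (pathList == null || pathList.isEmpty()) {
            return hits;
        }
        for (String dir : pathList.split(Pattern.quote(File.pathSeparator))) {
            if (!dir.isEmpty() && new File(dir, fileName).isFile()) {
                hits.add(dir);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java NativeLibLocator <native-library-file-name>");
            return;
        }
        // java.library.path is the list of directories the JVM searches
        // when Java code asks it to load a native library.
        String searchPath = System.getProperty("java.library.path");
        List<String> hits = dirsContaining(searchPath, args[0]);
        if (hits.isEmpty()) {
            System.out.println(args[0] + " was not found on java.library.path:");
            System.out.println("  " + searchPath);
        } else {
            System.out.println(args[0] + " found in: " + hits);
        }
    }
}
```

Run it with the library's file name as the only argument. An empty result means the folder still has to be added to the relevant environment variable before starting the JVM.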


CASScript

There is a nice command-line tool available on the portal called CASScript. This tool allows you to execute API commands without having to write code.

Other Features

Centera also offers options for enforcing content retention, optimizing performance for small objects by storing them inline in a tag, authentication profiles, and many more. The guides available with the SDK cover these in greater detail.

Resources

Well, that is it from me until next month. Feel free to leave comments/feedback using the comments link below.
