Saturday 8 August 2015

Compute Binomial Coefficient


The crux of the solution is recalling an important mathematical identity (Pascal's rule):
nCr = (n-1)Cr + (n-1)C(r-1)

//algo
To find nCk, define an array a of size k+1, zero-initialized;

a[0] = 1;
for (int i = 1; i <= n; i++) {
  for (int j = Math.min(i, k); j >= 1; j--) {
    a[j] = a[j] + a[j-1]; // Pascal's rule applied in place, right to left
  }
}

return a[k];
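Putting the steps above together, here is a minimal runnable sketch (the class and method names are my own):

```java
public class Binomial {
    // Computes C(n, k) via Pascal's rule using a single 1-D array.
    static long binomial(int n, int k) {
        long[] a = new long[k + 1];
        a[0] = 1;
        for (int i = 1; i <= n; i++) {
            // Walk right to left so a[j-1] still holds the previous row's value.
            for (int j = Math.min(i, k); j >= 1; j--) {
                a[j] += a[j - 1];
            }
        }
        return a[k];
    }

    public static void main(String[] args) {
        System.out.println(binomial(5, 2));  // 10
        System.out.println(binomial(10, 3)); // 120
    }
}
```

The right-to-left inner loop is what lets one array stand in for two rows of Pascal's triangle.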




Levenshtein Distance

For starters, the Levenshtein distance is the minimum number of edits needed to transform a string A into a string B. The allowed edits are: deleting a character from A, inserting a character into A, or substituting a character in A with the corresponding character in B.
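As a concrete sketch, the standard dynamic-programming formulation (unit cost for each edit) can be written like this:

```java
public class Levenshtein {
    // dp[i][j] = edit distance between the first i chars of a and first j chars of b.
    static int distance(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) dp[i][0] = i; // i deletions
        for (int j = 0; j <= b.length(); j++) dp[0][j] = j; // j insertions
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int sub = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                dp[i][j] = Math.min(dp[i - 1][j - 1] + sub,    // substitute (or match)
                           Math.min(dp[i - 1][j] + 1,          // delete from a
                                    dp[i][j - 1] + 1));        // insert into a
            }
        }
        return dp[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // 3
    }
}
```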

Application: Spell Checker

Not many sources explain this algorithm well, but I happened to find one that explains it superbly.

Check this out:
http://oldfashionedsoftware.com/tag/levenshtein-distance/



Friday 17 April 2015

MongoDB backup tipoff

Using MongoDB for distributed storage and management is suitable when the data will largely be looked up and only occasionally updated, but will not serve online transactions. As with any other database system, back-up mechanisms are just as important here.

I have always used mongoexport and mongoimport to export MongoDB collection data and import it back into another collection. But I didn't know that this is not a reliable way to back up collection data and restore it when needed.

Here is what I observed. Upon mongoexport with version 2.6, some fields get missed out for some documents. Further, mongoexport doesn't reliably capture data type information either. For example, a NumberLong is not imported back as NumberLong when the exporting machine and the importing machine differ in platform. For my own use case, this later led to data look-up issues when using Morphia.

Always use mongodump and mongorestore when you want to back up your data and restore it later. I found them reliable, as the BSON dump also stores the data type information explicitly.
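As a concrete sketch (the database name mydb and the ./dump path are placeholders, not from my actual setup):

```shell
# Dump the 'mydb' database as BSON, which preserves type information.
mongodump --db mydb --out ./dump

# Restore it later from the dump directory.
mongorestore --db mydb ./dump/mydb
```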




Sunday 1 March 2015

Validating a sequence of intermixed push and pop operations on a stack

I was reading this interesting problem on Stackoverflow:


The problem is not as complex as it seems at first glance. It clearly stipulates a condition for a legitimate push: you may only push values in ascending order.

Here, the key is to look at the sequence: when you see a number k in it, all numbers 0,1,2,...,k (every m < k, then k itself) must already have been pushed. That is, you may not push any r > k before k itself has been pushed onto the stack.

With that simple idea in hand, you can arrive at the solution pretty easily.
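Under my reading of the problem (values 0,1,2,... are pushed in ascending order, interleaved with pops, and we are given only the popped values), a sketch of the validator looks like this:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PopSequence {
    // Returns true if 'pops' could be produced by pushing 0,1,2,... in
    // ascending order, interleaved with pops, on a single stack.
    static boolean isValid(int[] pops) {
        Deque<Integer> stack = new ArrayDeque<>();
        int next = 0; // smallest value not yet pushed
        for (int p : pops) {
            // Before p can be popped, everything up to p must have been pushed.
            while (next <= p) {
                stack.push(next++);
            }
            if (stack.isEmpty() || stack.pop() != p) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid(new int[]{2, 1, 0})); // true
        System.out.println(isValid(new int[]{1, 2, 0})); // true
        System.out.println(isValid(new int[]{2, 0, 1})); // false: 1 sits above 0
    }
}
```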



Thursday 26 February 2015

ORM using Morphia: Tip off

Morphia is a great ORM framework that maps your Java objects to Mongo documents.

A general piece of advice is to refrain from using primitive types like int, long and double in your DOs, and instead substitute their wrapper-class counterparts. This is because Morphia fills in default values for primitive-typed fields when no value has been stored for them. This is undesirable behavior most of the time, especially when the default values carry other meaning in your domain context.
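The underlying Java behavior can be seen without Morphia at all: a primitive field silently defaults to 0, while a wrapper field stays null, so "no value was set" remains distinguishable (the DataObject class here is just an illustration, not from a real project):

```java
public class DataObject {
    int primitiveCount;   // defaults to 0: indistinguishable from a stored 0
    Integer wrapperCount; // defaults to null: clearly "no value was set"

    public static void main(String[] args) {
        DataObject d = new DataObject();
        System.out.println(d.primitiveCount); // 0
        System.out.println(d.wrapperCount);   // null
    }
}
```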


SpringBatch Chunking: Tip off

I have been working with the Spring Batch framework for a while now. As comfortable as I have found it to work with, there are instances when I am confounded by some interesting batch-specific behavior.

For instance, when you use chunking for data processing, it is a common thing to use a reader, processor and writer for processing the data chunks. 

Note here that the chunk size is not 1 unless it has been explicitly specified in the batch description XML.

Remember that exclusivity of private data in the chunk-processing pipeline holds only between items of different chunks, not between items of the same chunk.

For example:

class MockProcessor implements ItemProcessor<K, V> {

    private Map<K, V> cache = new HashMap<>();

    @Override
    public V process(K item) throws Exception {
        // use the cache; k and v stand for values derived from item
        cache.put(k, v);
        return v;
    }
}

Here, the cache contents are private and exclusive only between items belonging to different chunks, not between items in the same chunk. So, if you have any item-specific processing to do, remember to set the chunk size to 1; better still, clear and re-initialize your cache upon each process invocation.
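A sketch of the clear-on-each-invocation idea (the ItemProcessor interface here is a minimal stand-in for Spring Batch's org.springframework.batch.item.ItemProcessor, so the snippet runs on its own):

```java
import java.util.HashMap;
import java.util.Map;

public class PerItemCacheDemo {
    // Minimal stand-in for Spring Batch's ItemProcessor interface.
    interface ItemProcessor<I, O> {
        O process(I item) throws Exception;
    }

    static class MockProcessor implements ItemProcessor<String, Integer> {
        private final Map<String, Integer> cache = new HashMap<>();

        @Override
        public Integer process(String item) {
            cache.clear(); // re-initialize so state never leaks between items
            cache.put(item, item.length());
            return cache.get(item);
        }
    }

    public static void main(String[] args) {
        MockProcessor p = new MockProcessor();
        System.out.println(p.process("chunk")); // 5
        System.out.println(p.process("of"));    // 2
    }
}
```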

Wednesday 25 February 2015

Object pooling with valueOf()

We have all wrapped primitive int values in wrapper Integer objects.

The regular way to achieve this would be to write:

Integer ONE = new Integer (1);

This would instantiate a new object each time as evidenced by the presence of the new operator. This repeated instantiation is wasteful most of the time.

We can instantiate the object once, pool it, and re-use it when required later. This may be done with the valueOf method, which by default caches Integer values in the range -128 to 127.

Integer ONE = Integer.valueOf(1);

This may be compared to String instantiation by two different techniques:

String happy = "happy"; // pooling (the literal is interned)
String happy = new String ("happy"); // non-pooling (always a new object)
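The pooling is observable with an identity comparison (note that valueOf only guarantees caching for values in -128..127, so larger values may still yield distinct objects):

```java
public class PoolingDemo {
    public static void main(String[] args) {
        // valueOf returns the same cached object for small values.
        System.out.println(Integer.valueOf(1) == Integer.valueOf(1)); // true
        // new always creates a distinct object.
        System.out.println(new Integer(1) == new Integer(1));         // false
        // String literals are interned; new String is not.
        System.out.println("happy" == new String("happy"));           // false
    }
}
```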