Wednesday, December 12, 2012

Collection Framework: Missing concepts - 2


In my last post, I outline a brief of heart of the collections in java. In this second part of the I am discussing it further. In the post focused on study of container with access point of view.

As we know Container is most commonly used datatype. Each of the datatype provides a uniform set of operations to manage container based on the their internal storage methods. Both object description and storage method in a container can be dynamically changed so that abstract operations can be exactly matched with the run time requirement for operational flexibility and performance.

So, A container data type is an interface to manage collection of objects. An instance of a container datatype is a container. Common data structure to implement container data types are :
  1.  List 
  2. Stack
  3. Queue
  4. Set
  5. Map
Each of the above data structure has unique operational an performance properties. All container data types support the same abstract set of operations
  1. Insert
  2. Delete
  3. Search 
  4. Iterate.


List is by far the most commonly used container datatype. It stores a list of objects of a given type. The stored objects can be accessed by index. Internally, the List is implemented using an array, ensuring that index-based access is very fast.
Lists are typically implemented either as linked lists (either singly or doubly linked) or as arrays, usually variable length or dynamic arrays.


A stack is a basic data structure, where insertion and deletion of items takes place at one end called top of the stack. This structure is used all throughout programming. The basic concept can be illustrated by thinking of your data as a stack of plates or books where you can only take the top item off the stack in order to remove things from it. The two operations applicable to all stacks are:
  • a push operation, in which a data item is placed at the location pointed to by the stack pointer, and the address in the stack pointer is adjusted by the size of the data item;
  • a pop or pull operation: a data item at the current location pointed to by the stack pointer is removed, and the stack pointer is adjusted by the size of the data item.
Stack data structure is  designed as a type of queue; they form Last-In-First-Out (LIFO) queues. 


In Queue data structure the entities in the collection are kept in order and the principal (or only) operations on the collection are the addition of entities to the rear terminal position, known as enqueue, and removal of entities from the front terminal position, known as dequeue. This makes the queue a First-In-First-Out (FIFO) data structure. In a FIFO data structure, the first element added to the queue will be the first one to be removed. This is equivalent to the requirement that once an element is added, all elements that were added before have to be removed before the new element can be invoked.


Set is an abstract data structure that can store certain values, without any particular order, and no repeated values


Map is an abstract data type composed of a collection of (key, value) pairs, such that each possible key appears at most once in the collection.

That is all for now.................

Wednesday, September 26, 2012

Collection Framework: Missing concepts


When I started with Java, my first encounter with the collection framework was on second day only. Collection framework has many implementations. Each of these is solving some very specific need of data structure.
There was a lot of information about every use case and its optimal Collection implementations. But, now after four years, I still have to go through the same process when there is a decision to make about use of collection.
Deciding which collection to use is often one of the most important decisions to make when designing a program.
By definition, a collection provides means for storing and manipulating groups of data as a single unit. This single unit is called a Collection; sometimes also called a container.

The Container 

 In object oriented programming, container is defined as a class capable of storing other objects. These classes usually implement some kind of data structure such as map, set, stacks, etc. The size of the collection of objects is adjusted automatically in a container class.
A Container can be divided into two groups;

  1. Value based containers
  2. Reference based containers

Value based containers 

Value based containers store copies of objects. If we access an object, an object returns a copy of it. If external object is changed after it has been inserted in container, it will not affect the content of the container at all.

Reference based containers 

Reference based containers Store pointers or references to the object. If we access an object, an object returns reference to it. If external object is changed after it has been inserted in container, it will also affect the content of the container.

The Study 

 Further to this the container can be studied under three points of views.


It means accessing the container elements. In case of arrays accessing is done using array index. For stacks, access of elements is done using LIFO (Last In First Out) and in queues it is done using FIFO (First In First Out).


It includes storing of items of containers. Some containers are finite containers and some are infinite containers.


It includes how the item can be traversed.

The Summary 

Now, look for the collection with above concepts imprinted in brain. Selection of the correct container data structures in a program can do wonders.

  1. It can make the code a hundred times faster.
  2. Help in to reduce memory requirements 
  3. To make the source code cleaner (in turn making it easier to write, debug, and expand). 

Typically, the most commonly used data structures are ones which are implementations of abstract data types (Containers). Whereas, an abstract data type (ADT) specifies what operations can be performed on a data structure. A data structure is the concrete implementation of ADT, specifying how much memory is required and, crucially, how fast the execution of each operation will be.
More on this later.......