Pagine

12 febbraio 2008

Lesson 6: STL containers

In the last lesson (Lesson 5: Polymorphism) we learned what in my opinion is the very heart of object oriented programming.

If you remember well, I was concluding that lesson announcing the subject of the following one to be about template. This seems to be in contrast with the title of this lesson, but it is not!

Template is a very vast subject and before teaching you how you can define your own template class or function, I would like you to be able to use existing templates in the so called Standard Template Library.

The standard template library

In my opinion the STL should be the best friend a newbie in programming. This is a very versatile library profiting from the extensive use of template parameters. We can divide the contents of STL in three parts:
  1. Contaniers class. This is a group of classes that can be used to store any kind of object, from very simple built-in types to complex user defined classes (examples: vector, list, map...). This is the main topic of this current lesson.
  2. Algorithm. The STL also contains a large collection of algorithms to manipulate the contents of container classes. For example, given a sorting function, the content of a vector can be sorted.
  3. Iterators. Having a group of elements, it sounds reasonable to have an easy way to access (sequentially or randomly) to one of the elements. In a very naive way we can say that an iterator stays to a container as a pointer stays to an array.
Let's start from the container classes. Here below you can find a list of all available containers divided into their three main categories.

Sequence containers
: vector, list, deque

Sequence means that the elements within the container are strictly in linear order. The most used of them is the vector because it is very similar to the more familiar array and it is automatically taking care of the dynamic memory allocation. Vectors are very good when the user wants to access an individual element by its position index, iterating over the elements in any order and adding and removing elements at the end.

Compared to arrays, they provide almost the same performance for these tasks, plus they have the ability to be easily resized. Although, they usually consume more memory than arrays when their capacity is handled automatically (this is in order to accomodate for extra storage space for future growth).

Compared to the other base standard sequence containers (deques and lists), vectors are generally the most efficient in time for accessing elements and to add or remove elements from the end of the sequence. For operations that involve inserting or removing elements at positions other than the end, they perform worse than deques and lists, and have less consistent iterators and references than lists.

How does a vector work? Vectors, like all containers, have a size, which represents the amount of elements contained in the vector. But vectors, also have a capacity (be careful not to be mistaken!), which determines the amount of storage space they have allocated, and which can be either equal or greater than the actual size. The extra amount of storage allocated is not used, but is reserved for the vector to be used in the case it grows. This way, the vector does not have to reallocate storage on each occasion it grows, but only when this extra space is exhausted and a new element is inserted. Reallocation may be a costly operation because the operating system has to look for another free contiguous region of the memory large enough to store the enlarged vector and copy there all the existing vector components. For this reason, if the user has already an a priori knowledge of the vector size, it is definitely better to specify it via the reserve method.

How do I use a vector? Good point, have a look at the example code below...

#include <vector>
#include <iostream>

using namespace std;

int main(int, char**) {

   // let's start creating an empty vector of integer
   vector< int > aVectorOfInteger;

   // adding something at the end of the vector is very
   // easy and it can be done using the push_back method
   //
   // So for example in a loop we can have

   const unsigned int n = 100;

   for ( unsigned int i = 0 ; i < n ; ++i ) {

      aVectorOfInteger.push_back( n - i );

   }

   // now that we have something inside we can also
   // read it back using the operator[]

   for ( unsigned int i = 0 ; i < aVectorOfInteger.size() ; ++i ) {

      cout << "The element " << i << " is " << aVectorOfInteger[i]
           << endl;

   }

   return 0;

}


In the previous code we made use of all the most used methods with vectors; in particular look at the use of push_back, size and operator[]. When constructing a vector, the programmer has to specify the type contained into the container. This is done putting the type name in angular bracket (< >) just after the vector keyword. Don't forget to add vector to you include list. When the user call the default vector constructor, this is built empty. If you want to create a vector with n components set to value, you can use this other constructor:



#include <iostream>
#include <vector>

using namespace std;

int main(int,char**) {

   // create a vector with 10 components
   // all of them set to 5.2
   vector< double > aVector( 10, 5.2 );

   // otherwise you can convert an array into a
   // vector doing the following

   const int n = 10;
   double array[n];
   for ( int i = 0; i < n ; ++i )
      array[i] = 1.5 * i ;

   vector< double > anotherVector( &array[0], &array[n] );

   return 0;
}


Here you saw also an interesting trick to convert standard C arrays into more easy to use C++ vector.

When do we have to use list and deque? Another good point. Lists perform generally better in inserting, extracting and moving elements in any position within the container, and therefore also in algorithms that make intensive use of these, like sorting algorithms. The main drawback of list compared to other sequence containers is that they lack direct access to the elements by their position. For example, to access the sixth element in a list one has to iterate from a known position (like the beginning or the end) to that position, which takes linear time in the distance between these.

Deque (usually pronounced like "deck") is an irregular acronym of double-ended queue. Double-ended queues are a kind of sequence containers. As such, their elements are ordered following a strict linear sequence. They provide a similar functionality as the one provided by vectors, but with efficient insertion and deletion of elements also at the beginning of the sequence and not only at its end. On the drawback side, unlike vectors, deques are not granted to have all its elements in contiguous storage locations, eliminating thus the possibility of safe access through pointer arithmetics.

My conclusive words are: use lists when you plan to access all the elements of the container and use vectors when you need to have random access to their components.

Associative containers: (multi-)set, (multi-)map, bitset

Associative containers are containers especially designed to be efficient accessing its elements by their key (unlike sequence containers, which are more efficient accessing elements by their relative or absolute position). In practice, this means that instead of accessing to a particular element of the container using its position (as for vectors), elements of associative containers are accessed through the use of a key.

The most common associative container is a map. In a map, the key value is generally used to uniquely identify the element, while the mapped value is some sort of value associated to this key. Types of key and mapped value may differ. For example, a typical example of a map is a telephone guide where the name is the key and the telephone number is the mapped value. Internally, the elements in the map are sorted from lower to higher key value following a specific strict weak ordering criterion set on construction. If it is possible to have the same key for more than one mapped value, you need to use multimap instead of map. The same consideration apply also for set and multiset. The main difference between a set and a map is that in a set the key and the mapped value are the same object.

Here below an example about the use of maps.


#include <iostream>
#include <map>
#include <string>

using namespace std;

int main(int, char**) {

   // create a map to associate to a person name
   // to its age. Look at the two template parameters
   map< string, int > mapAge;

   // add some friends
   // look at the use of make_pair to create
   // the association
   mapAge.insert( make_pair( "toto", 31 ) );
   mapAge.insert( make_pair( "Marcin", 29 ) );

   // you can also add elements in this way
   mapAge["frisigo"] = 31;

   // now we can access to the mapped value of the map
   cout << " toto is " << mapAge["toto"] << " years old " << endl;


   // another intersing use is the following
   map< string, string > mapPhoneDirectory;

   // add some numbers in this way
   mapPhoneDirectory.insert( make_pair( "toto", "+39 012 34567890") );

   // or in this other way
   mapPhoneDirectory["frisigo"] =  "+39 011 22334455" ;

   // now look on the phonebook for the number for frisigo
   map< string, string >::iterator phoneRecord = mapPhoneDirectory.find("frisigo");


   // if the key you were looking for is not in the container
   // then find returns the end of the container
   if ( phoneRecord != mapPhoneDirectory.end() ) {

      // look at the way we access to the key and the mapped value
      // using the first and second data member
      // remember to derefence the iterator because it
      // behaves like a pointer.
      cout << "The number of user " << (*phoneRecord).first
           << " is " << (*phoneRecord).second << endl;

   }

   return 0 ;
}


A special care has to be reserved for bitset. It is a special container class that is designed to store bits (elements with only two possible values: 0 or 1, true or false ...). The class is very similar to a regular array, but optimizing for space allocation: each element occupies only one bit (which is eight times less than the smallest elemental type in C++: char). The behavior of bit set is very similar to the boolean specialization of vectors.

Containers adaptor: stack, queue and priority queue.

The last kind of container class is not a real container class but it modify the behavior of any of the other containers making them a FIFO (first-in, first-out), a LIFO (last-in, first-out) or a prioritized queue.

Here comes an example on the use of FIFO queues:


#include <iostream>
#include <queue>
#include <string>

using namespace std;

int main(int, char**) {

   // create a queue (FIFO) of strings
   queue< string > waitingLine;

   // now fill in the queue with some people
   // waiting in the line
   while ( waitingLine.size() < 5 ) {
      cout << "Welcome to the waiting line. Enter your name plz... " << endl;

      // get the name from the std input
      string s;
      getline( cin, s );

      // add the current person to the FIFO
      waitingLine.push( s ) ;
   }

   // now serve the queue
   while ( !waitingLine.empty() ) {
      cout << "Now serving: " << waitingLine.front() << endl;        

      // remove the first of the line   
      waitingLine.pop(); 
   }  
   return 0;
} 


Now try to compile and run this code to see how the FIFO queue behaves. A special world is needed by the priority_queue. In this container adaptor the elements are sorted according to a strict weak ordering algorithm, so that the one on the top is not the first pushed in but the one with the highest priority.

This one was a long (but great) lesson, the next one will be about Algorithms and Iterators, so stay tuned and digg this is you like.

Here below some examples you can try on your system.



Chiunque può lasciare commenti su questo blog, ammesso che vengano rispettate due regole fondamentali: la buona educazione e il rispetto per gli altri.

Per commentare potete utilizzare diversi modi di autenticazione, da Google a Facebook e Twitter se non volete farvi un account su Disqus che resta sempre la nostra scelta consigliata.

Potete utilizzare tag HTML <b>, <i> e <a> per mettere in grassetto, in corsivo il testo ed inserire link ipertestuali come spiegato in questo tutorial. Per aggiungere un'immagine potete trascinarla dal vostro pc sopra lo spazio commenti.

A questo indirizzo trovate indicazioni su come ricevere notifiche via email sui nuovi commenti pubblicati.

0 commenti:

Posta un commento

Chiunque può lasciare commenti su questo blog, ammesso che vengano rispettate due regole fondamentali: la buona educazione e il rispetto per gli altri.

Per commentare potete utilizzare diversi modi di autenticazione, da Google a Facebook e Twitter se non volete farvi un account su Disqus che resta sempre la nostra scelta consigliata.

Potete utilizzare tag HTML <b>, <i> e <a> per mettere in grassetto, in corsivo il testo ed inserire link ipertestuali come spiegato in questo tutorial. Per aggiungere un'immagine potete trascinarla dal vostro pc sopra lo spazio commenti.

A questo indirizzo trovate indicazioni su come ricevere notifiche via email sui nuovi commenti pubblicati.