Wednesday, December 15, 2010

Abstract Datatype

Linked list
struct node{
int node;      //datapart
struct* node;  //pointer to next node structure
}
There is one head pointer pointing to node structure but treated specially  Its allocated on stack while these nodes are stored on heap with dynamic memory allocation call to malloc(int)

isEmpty(struct node*)
Input: heap pointer
If head pointer is null
 true
else
 false

int Length(struct node * header)
Input head pointer
create another duplicate pointer instead of  changing the head pointer because we donot want to lose start of list call it current pointer to struct node
here header and current pointers are local to this function only

while(!isEmpty)
count++
current= current ->next

InserAfter ( struct node ** head pointer, newdata) //insert new element at beginning of list
// head pointer by reference instead by value
allocate newNode on heap
struct node * newNode = malloc(sizeof(struct node));

here head points to the actual head pointer instead to first node


set data newNode->data = newdata
set next pointer link of newNode to remaining list newNode ->next = * head
set head of list to newNode , if we had passed copy of pointer we wouldn't be able to change it
*head = newNode
we change the head pointer only at last because we donot want to lose
call to it would be like InsertFirst(&head,data)
where head in the calling program would be defined like struct node * head

changing a pointer with reference pointer
we pass a pointer with one extra star like we did above struct node **

Common mistakes in Linked list
1)passing pointer by value when updating the head in the function
2)

Deleting Max from linked list
can be done in single pass if we maintain the maxprevious else would need two pass
Binary search on linked list
would take O(nlogn) which is more than on array(logn)


remove duplicates from the unsorted array and leave the order of numbers same
like  2 1 8 2 7 4 8 should be 2 1 8 7 4
can be done in O(nlogn) if we attach a key with all the numbers to keep track of order
first we sort according to value then O(nlogn)
remove duplicates O(n) leave keys with smaller number
sort according to keys O(nlogn)

size of stack when function is should depend on the depth of function calls
that is for permutation number of call is proportional to n
for fact also number of calls is proportional to n
when there is function call then address and variable are piled on stack and new function is free to change the registers
so when permutation(10) is called it will pile the address and variable for the main function
and it will call permute reursively until permute(1) in which it doesnot need to call any function
 _________
main call     |
------------- 
permute 10 |
------------
permute9    |
 ------------
 ________
permute2   |
------------

so 9 is the depth for the function and first call piles main block address and register values on stack

Polish Notation
prefix notation
operator before operands

Reverse Polish Notation
every operator follows its operand


Infix to postfix conversion
values are moved to output queue
when operator has higher precedence than operator on top of stack push it on to stack
when operator have same precedence as top of stack and its left associative pop top of stack and push new operator
when operator has same precedence and is evaluated from right push it
when left parenthesis is found push onto stack
when right parentesis is found pop stack top to output until left parenthesis found and discard both left and right


stack error
underflow trying to pop empty stack and overflow trying to push onto already full
array can be shared to allow two stack implemented in same
Queue

FIFO
The queue has head and tail
element is added at tail and removed from head
head(Q) indexes head of the queue that is the fist element in fifo sequence
tail(Q) points to next location at which new data will be inserted, one ahead of last lement
we wrap around when tail reaches the end of the queue + 1

Enqueue operation inserts element at the tail and moves tail ahead by one to point to next empty
Dequeue operation removes from the head and moves head to point to next element

In circular queue when head points to the first element and tail points to the last element there is problem to check the empty and full condition because when
tail = front - 1  it can be empty or can be full
solution to this problem involves using additional varible which keeps count of number of elements or
keeping a gap between head and tail by one null space


Stack with one queue (circular)
operation we need push and pop
order we need is LIFO
adding element to queue should not be problem because we can enqueue element at tail now to delete
we can delete from front only by dequeue operation so we need to move all elements after rear that is dequeue each element until last, remaining list will be in same order as before

Queue with two stacks
Enqueue operation is simple
push the new element on top of stack1
Its of O(1)

Dequeue operation
we can pop only top of the stack which is last element but for dequeue we need the last element of the stack that is first element inserted so we have to pop(O(1)) all the element from stack1(O(n)) and push(O(1)) them on to stack2 then pop the first element from the stack2 and again push all remaining element onto stack1
Time complexity is O(n)


Hash Table
data structure for implementing dictionary(Insert,delete search)
Worst can is O(n) for searching  but normally it achieves O(1)

Direct Address Table
works well when the number of distinct keys are small
element is stored at location indexed by its key
that is element a with key k is stored in table T as address T(k)
Dictionary operations take O(1) time


Hash table
element a with key k is stored at location h(k) where h is hash function
hash function maps the universe of keys to the slots of the table T
here we are reducing range of array indices which in the previous case was T[U] and in this case would be
T[1...m]
two keys may hash to same slot, that is,h[k2]=h[k3] can happen this is called collision

Collision Resolution by chaining
We put all the elements that hash to same slot in linked list
search in this would be like : check the list T[h(k)] for  element, time complexity depends on the length of list
Insertion : insert at head of list T[h(k)]   O(1) in worst case
Deletion: given an element x to delete from table T
For doubly linked list its O(1) we can set the predecessor pointer without traversing
For single inked list it would take traversing to find the predecessor to set

Consider Hash table T with m slots that stores n elements
load factor a =  n / m , average number of elements stored in a chain
Worst case
when all elements maps to same index in table and creates a long chain
for searching it would be then Th(n) + time to compute hash function

Average case
we consider simple uniform hashing
Probability that two keys would hash to same location is equally likely that is 1/m

searching would take O(a+1)

when number of elements n in the table is proportional to hash table slots that is
n = O(m)
a= O(m)/m = O(1)
and searching would take O(1)

Problem what is expected number of collisions for a pair of distinct keys, assuming uniform hashing.
Solution
there are total C(n 2) key pairs
For each pair
probability of collision  P(c)= 1/m because of uniform hashing
X be the random variable which is equal to 1 in case of collision that is when (h(k1) = h(k2))
we can find the Expected value for collision over all keys
C(n 2)(1*p(c))
= n(n-1)/2m

Hash Function
has function should be independent of the keys 
Division method
h(k) = k mod m
fast only single division is required
If m is power of 2, 2^p than h(k) would consider only p lower order bits of k its not good choice
A prime not too close to power of 2 is good choice

Multiplication Method


Open addressing
no linked list for each slot only, single element in each slot
load factor =1
avoids any pointers that memory is used in more slots
we compute the sequence of nodes to examine instead of
 Insertion
we find an empty slot by probing
the probe order is not be sequential it depends on the key
for each key we have a probe sequence like
{h(k,1),h(k,2) ..........h(k,m-1)}
this probe sequence should be permutation of 1 2 3 m-1 m so that all slots are considered 

Hash-Insert(T,k)
for each probe sequence j = h(k,i)
if no element at that location insert it
else probe next

search probes the same sequence as used in the insertion
Has-Search(T,k)
search all probe sequence until k is found
we can improve this by making delete function specially so that we stop search when we find first null in the probe sequence

Delete
when we delete an element we mark it with something instead of assigning null

collision technique is preferred when keys must be deleted that is more efficient than open addressing
For uniform hashing each permutation generated by probing sequence should be equally likely
any probing technique which generates near to m! probing sequence is better

Linear Probing
probing sequence is linear that is h'(k) h'(k)+1 .... m-1 0 1 2 .. k-1
that is
h(k,i) = (h'(k) +i )mod m
for i = 0 1 2 ....m-1
there are m different probe sequence possible.
easy to implement but primary clustering the series of occupied slots will increase in order, increasing the search time

Quadratic Probing
h(k,i) = (h'(k)+ c1i+c2i^2)
first slot probed is h'(k)

No comments:

Post a Comment