These notes are intended for use by students in cs1501 at the University of Pittsburgh and no one else



Download 186.06 Kb.
Page5/24
Date03.05.2017
Size186.06 Kb.
1   2   3   4   5   6   7   8   9   ...   24

de la Briandais Trees

  • Idea of de la Briandais Trees (dlB)
    • Now, a "node" from multiway trie and Patricia will actually be a linked-list of nodes in a dlB
      • Only pointers that are used are in the list
        • Any pointers that are not used are not included in the list
      • For lower levels this will save an incredible amount
    • dlB nodes are uniform with two references each
      • One for sibling and one for a single child
  • de la Briandais node
  • character (bit pattern)
  • ref to child node (next level)
  • ref to sibling node (same level)

de la Briandais Trees

    • For simplicity of Insert, we will also not have keys stored at all, either in internal or external nodes
      • Instead we store one character (or generally, one bit pattern) per node
      • Nodes will continue until the end of each string
        • We match each character in the key as we go, so if a null reference is reached before we get to the end of the key, the key is not found
        • However, note that we may have to traverse the list on a given level before finding the correct character
    • Look at example on board

de la Briandais Trees

    • Run-time?
      • Assume we have S valid characters (or bit patterns) possible in our "alphabet"
        • Ex. 256 for ASCII
      • Assume our key contains K characters
      • In worst case we can have up to Θ(KS) character comparisons required for a search
        • Up to S comparisons to find the character on each level
        • K levels to get to the end of the key
      • However, this is unlikely
        • Remember the reason for using dlB is that most of the levels will have very few characters
        • So practically speaking a dlB search will require Θ(K) time

de la Briandais Trees

    • Implementing dlBs?
      • We need minimally two classes
        • Class for individual nodes
        • Class for the top level DLB trie
      • Generally, it would be something like:
  • Java
  • C++
  • public class DLB {
  • private DLBnode root;
  • // constructor plus other
  • // methods
  • private class DLBnode {
  • public char value;
  • public DLBnode rightSib;
  • public DLBnode child;
  • }
  • }
  • class DLBnode
  • {
  • public:
  • char value;
  • DLBnode * rightSibling;
  • DLBnode * child;
  • }
  • class DLB
  • {
  • private:
  • DLBnode * root;
  • public:
  • // constructor plus other
  • // methods
  • }

de la Briandais Trees

      • Search Method:
        • At each level, follow rightSibling references until
          • character is matched (PROCEED TO NEXT LEVEL) or
          • NULL is reached (string is not found)
        • Proceed down levels until "end of string" character coincides with end of key
          • For "end of string" character, use something that you know will not appear in any string. It is needed in case a string is a prefix of another string also in the tree
      • Insert Method
        • First make sure key is not yet in tree
        • Then add nodes as needed to put characters of key into the tree
          • Note that if the key has a prefix that is already in the tree, nodes only need to be added after that point
          • See example on board

de la Briandais Trees

      • Delete Method:
        • This one is a bit trickier to do, since you may have to delete a node from within the middle of a list
        • Also, you only delete nodes up until the point where a branch occurred
          • In other words, a prefix of the word you delete may still be in the tree
          • This translates to the node having a sibling in the tree
        • General idea is to find the "end of string" character, then backtrack, removing nodes until a node with a sibling is found
          • In this case, the node is still removed, but the deletion is finished
          • Determining if the node has a sibling is not always trivial, nor is keeping track of the pointers
        • See example on board

de la Briandais Trees

    • Also useful (esp. for Assignment 1)
      • Search prefix method
        • This will proceed in the same way as Search, but will not require an "end of string" character
        • In fact, Search and Search prefix can easily be combined into a single 4-value method:
          • Return 0 if the prefix is not found in the trie
          • Return 1 if the prefix is found but the word does not exist (no "end of string" character found)
          • Return 2 if the word is found
          • Return 3 if the word is found and it is a prefix
        • This way a single method call can be used to deter-mine if a string is a valid word and / or a valid prefix
        • For maximum credit, this approach must be used in Assignment 1 Part B

More Searching

  • So far what data structures do we have that allow for good searching?
    • Sorted arrays (lgN using Binary Search)
    • BSTs (if balanced, search is lgN)
    • Using Tries (Theta(K) where we have K characters in our string)
  • Note that using Tries gives us a search time that is independent of N
    • However, Tries use a lot of memory, especially if strings do not have common prefixes

Searching

  • Can we come up with another Theta(1) search that uses less memory for arbitrary keys?
  • Let's try the following:
    • Assume we have an array (table), T of size M
    • Assume we have a function h(x) that maps from our key space into indexes {0,1,…,M-1}
      • Also assume that h(x) can be done in time proportional to the length of the key
    • Now how can we do an Insert and a Find of a key x?

Download 186.06 Kb.

Share with your friends:
1   2   3   4   5   6   7   8   9   ...   24




The database is protected by copyright ©sckool.org 2020
send message

    Main page