Algorithms and Analysis cs 2308 Foundations of cs II algorithms



Download 23,31 Kb.
Date conversion25.04.2017
Size23,31 Kb.

Algorithms and Analysis

  • CS 2308 Foundations of CS II

Algorithms

  • Step-by-step instructions that tell a computing agent how to solve some problem using only finite resources
  • Resources
    • Memory
    • CPU cycles
      • Time/Space
  • Types of instructions
    • Sequential
    • Conditional
      • If statements
    • Iterative
      • Loops

Pseudocode: The Interlingua for Algorithms

  • an English-like description of the sequential, conditional, and iterative operations of an algorithm
  • no rigid syntax. As with an essay, clarity and organization are key. So is completeness.

Pseudocode Example

  • Pseudocode Example
  • Find Largest Number
  • Input: A list of positive numbers
  • Output: The largest number in the list
  • Procedure:
  • 1. Set Largest to zero
  • 2. Set Current-Number to the first in the list
  • 3. While there are more numbers in the list
  • 3.1 if (the Current-Number > Largest) then
  • 3.1.1 Set Largest to the Current-Number
  • 3.2 Set Current-Number to the next one in the list
  • 4. Output Largest
  • conditional
  • operation
  • Pseudocode Example
  • Find Largest Number
  • Input: A list of positive numbers
  • Output: The largest number in the list
  • Procedure:
  • 1. Set Largest to zero
  • 2. Set Current-Number to the first in the list
  • 3. While there are more numbers in the list
  • 3.1 if (the Current-Number > Largest) then
  • 3.1.1 Set Largest to the Current-Number
  • 3.2 Set Current-Number to the next one in the list
  • 4. Output Largest
  • iterative
  • operation
  • Pseudocode Example
  • Find Largest Number
  • Input: A list of positive numbers
  • Output: The largest number in the list
  • Procedure:
  • 1. Set Largest to zero
  • 2. Set Current-Number to the first in the list
  • 3. While there are more numbers in the list
  • 3.1 if (the Current-Number > Largest) then
  • 3.1.1 Set Largest to the Current-Number
  • 3.2 Set Current-Number to the next one in the list
  • 4. Output Largest
  • Let’s “play computer” to review this algorithm…

Algorithms vary in efficiency

  • example: sum the numbers from 1 to n
  • space requirement is constant (i.e. independent of n)
  • time requirement is linear (i.e. grows linearly with n). This is written “O(n)”
  • efficiency
  • space= 3 memory cells
  • time = t(step1) +
  • t(step 2) +
  • n t(step 4) +
  • n t(step 5)
  • 1. Set sum to 0
  • 2. Set currNum to 1
  • 3. Repeat until currNum > n
  • 4. Set sum to sum + currNum
  • 5. Set currNum to currNum + 1
  • Algorithm I
  • to see this graphically...

Algorithm Is time requirements

  • 0 1 2 … n
  • size of the problem
  • time
  • y = mx + b
  • time = m n + b
  • The exact equation for the line is unknown
  • because we lack precise values for the constants m and b.
  • But, we can say:
  • time is a linear function of the size of the problem
  • time = O(n)

Algorithm II for summation

  • The “key insight”, due to Gauss:
  • the numbers can be grouped into
  • 50 pairs of the form:
  • 1 + 100 = 101
  • 2 + 99 = 101
  • . . .
  • 50 + 51 = 101
  • First, consider a specific case: n = 100.
  • }
  • sum = 50 x 101
  • This algorithm
  • requires a single
  • multiplication!
  • Second, generalize the formula for any (even) n :
  • sum = (n / 2) (n + 1)
  • Time requirement is constant.
  • time = O(1)

Sequential Search: A Commonly used Algorithm

  • Suppose you want to a find a student in the Texas State directory.
  • It contains EIDs, names, phone numbers, lots of other information.
  • You want a computer application for searching the directory: given an EID, return the student’s phone number.
  • You want more, too, but this is a good start…

Sequential Search of a student database

  • algorithm
  • to search database
  • by EID :
  • 1. ask user to enter EID to search for
  • 2. set i to 1
  • 3. set found to ‘no’
  • 4. while i <= n and found = ‘no’ do
  • if EID = EIDi then set found to ‘yes’
  • else increment i by 1
  • 7. if found = ‘no’ then print “no such student”
  • else < student found at array index i >
  • .
  • .
  • .
  • .
  • .
  • .
  • name EID major credit hrs.
  • John Smith
  • Paula Jones
  • Led Belly
  • Chuck Bin
  • JS456
  • PJ123
  • LEB900
  • CB1235
  • .
  • .
  • .
  • .
  • .
  • .
  • physics
  • history
  • music
  • math
  • 36
  • 125
  • 72
  • 89
  • 1
  • 2
  • n
  • 3

Time requirements for sequential search

  • average case (expected amount of work):
  • EID found in studentn/2
  • n/2 loop iterations
  • because the amount of work is a constant multiple of n,
  • the time requirement is O(n)
  • in the worst case and the average case.
  • amount
  • of work
  • best case (minimum amount of work):
  • EID found in student1
  • one loop iteration
  • worst case (maximum amount of work):
  • EID found in studentn
  • n loop iterations

O(n) searching is too slow

  • Consider searching Texas State’s student database using
  • sequential search on a computer capable of
  • 20,000 integer comparisons per second:
  • n = 150,000 (students registered during past 10 years)
  • average case
  • 150,000 comparisons 1 seconds = 3.75 seconds
  • 2 20,000 comparisons
  • x
  • worst case
  • 150,000 comparisons x 1 seconds = 7.5 seconds
  • 20,000 comparisons
  • Bad news for searching NYC phone book, IRS database, ...

Searching an ordered list is faster: an example of binary search

  • .
  • .
  • .
  • .
  • .
  • .
  • name student major credit hrs.
  • John Smith
  • Paula Jones
  • Led Belly
  • Chuck Bin
  • 24576
  • 36794
  • 42356
  • 93687
  • .
  • .
  • .
  • .
  • .
  • .
  • physics
  • history
  • music
  • math
  • 36
  • 125
  • 72
  • 89
  • 1
  • 2
  • n
  • 3
  • student
  • 24576
  • 36794
  • 38453
  • 41200
  • 43756
  • 45987
  • 47865
  • 49277
  • 51243
  • 58925
  • 59845
  • 60011
  • 60367
  • 64596
  • 86756
  • 93687
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • note: the
  • student array
  • is sorted in
  • increasing
  • order
  • how would you
  • search for 58925 ?
  • 38453 ?
  • 46589 ?
  • Probe 1
  • Probe 2
  • Probe 3

The binary search algorithm

  • 1. ask user to input studentNum to search for
  • 2. set found to ‘no’
  • 3. while not done searching and found = ‘no’
  • 4. set middle to the index counter at the middle of the student list
  • 5. if studentNum = studentmiddle then set found to ‘yes’
  • 6. if studentNum < studentmiddle then chop off the last half
  • of the student list
  • 7. If studentNum > studentmiddle then chop off the first half
  • of the student list
  • 8. if found = ‘no’ then print “no such student”
  • else
  • How?
  • How?
  • What does this mean?

The binary search algorithm

  • assuming that the entries in student are sorted in increasing order,
  • 1. ask user to input studentNum to search for
  • 2. set beginning to 1
  • 3. set end to n
  • 4. set found to ‘no’
  • 5. while beginning <= end and found = ‘no’
  • 6. set middle to (beginning + end) / 2 {round down to nearest integer}
  • 7. if studentNum = studentmiddle then set found to ‘yes’
  • 8. if studentNum < studentmiddle then set end to middle - 1
  • 9. if studentNum > studentmiddle then set beginning to middle + 1
  • 10.if found = ‘no’ then print “no such student”
  • else

Time requirements for binary search

  • At each iteration of the loop, the algorithm cuts the list
  • (i.e. the list called student) in half.
  • In the worst case (i.e. when studentNum is not in the list called student)
  • how many times will this happen?
  • n = 16
  • 1st iteration 16/2 = 8
  • 2nd iteration 8/2 = 4
  • 3rd iteration 4/2 = 2
  • 4th iteration 2/2 = 1
  • the number of times a number n
  • can be cut in half and not go below 1
  • is log2 n.
  • Said another way:
  • log2 n = m is equivalent to 2m = n

This is a major improvement

  • n sequential search binary search
  • O(n) O(log2 n)
  • 100 100 7
  • 150,000 150,000 18
  • 20,000,000 20,000,000 25
  • 27=128
  • 218=262,144
  • 225 is about
  • 33,000,000
  • number of comparisons
  • needed in the worst case
  • 150,000 comparisons x 1 second = 7.5 seconds
  • 20,000 comparisons
  • in terms of seconds...
  • 18 comparisons x 1 second > .001 seconds
  • 20,000 comparisons
  • sequential search:
  • binary search:
  • vs.

Sorting a list

  • First, an algorithm that’s easy to write, but is badly inefficient...
  • unsorted
  • sorted
  • 35467
  • 67854
  • 46781
  • 13528
  • 87341
  • 1
  • 2
  • 3
  • 4
  • 5
  • initially
  • unsorted
  • sorted
  • 35467
  • 67854
  • 46781
  • 13528
  • 87341
  • 1
  • 2
  • 3
  • 4
  • 5
  • 13528
  • 1st iteration
  • unsorted
  • sorted
  • 35467
  • 67854
  • 46781
  • 13528
  • 87341
  • 1
  • 2
  • 3
  • 4
  • 5
  • 13528
  • 35467
  • 2nd iteration
  • unsorted
  • sorted
  • 35467
  • 67854
  • 46781
  • 13528
  • 87341
  • 1
  • 2
  • 3
  • 4
  • 5
  • 13528
  • 35467
  • 46781
  • etc.
  • 3rd iteration

The “Simple Sort” Algorithm

  • 1. set i to 1
  • 2. repeat until i > n
  • set indexForSmallest to the index of the smallest
  • positive value in unsorted
  • 4. set sortedi to unsortedindexForSmallest
  • 5. set unsortedindexForSmallest to 0
  • 6. increment i
  • given a list of positive numbers, unsorted1, …, unsortedn and
  • another list, sorted1, …, sortedn, with all values initially set to zero

This algorithm is expensive!

  • Time requirement
  • total time = n iterations x time per iteration
  • time per iteration = time to find smallest value in a list of length n
  • = O(n)
  • total time = n x O(n)
  • = O(n2)
  • Space requirement
  • total space = space for unsorted + space for sorted
  • = O(2n)

Creating Algorithms is the Challenge of Computer Science

  • A salesperson wants to visit 25 cities while minimizing the total number of miles driven, visiting each city exactly once, and returning home again. Which route is best?
  • It’s not easy; try this one:
  • The Traveling Salesperson problem:

Simplify the Problem to get an intuition about it

  • A B
  • C D
  • four cities connected by roads
  • Q: Starting at A, what’s the shortest route that meets the requirements?
  • A: Obvious to anyone who looks at the entire map. Not so obvious to an
  • algorithm that “sees”, at any one time, only one city and its roads
  • One algorithm to answer the question:
  • 1. generate all possible routes of length 5
  • 2. check each path to determine whether it meets the requirement
  • start at A,
  • visit B, C, and D in some order,
  • then return to A
  • How much time does this algorithm require?

All Paths from A of length 5

  • A
  • A D A D A D A D A D A D A D A D
  • B C B C B C B C
  • A D A D
  • B C
  • A B
  • C D
  • Number of paths to generate and check is 24 = 16.

Can you Improve the Algorithm?

  • Prune bad routes as soon as possible. What’s a “bad route?”
  • Look for good solutions, even if they’re sub-optimal. What’s a “good solution?”

This gets real bad, real fast!

  • In general, the algorithm’s time requirement is: (the number of roads in&out of a city) number of cities
  • Assuming the number of roads is only 2, the time requirement is 2number of cities , given by the powers of 2 table.
  • Assuming a computer could evaluate 10 million routes per second, finding the best route for 25 cities would require about 3.5 seconds. No problem!
  • However, finding the best route for 64 cities would require about seconds, or hours!
  • 20 million
  • 5000

Comparing the time requirements

  • 0 5 10 15
  • n
  • work
  • 35
  • 30
  • 25
  • 20
  • 15
  • 10
  • 5
  • 0
  • n
  • log2n
  • n2
  • 2n
  • order 10 50 100 1,000
  • log2n .0003 .0006 .0007 .001
  • n .0001 .005 .01 .1
  • n2 .01 .25 1 1.67 min
  • 2n .1024 3570 4x1016 forget it!
  • years centuries
  • time requirements for algorithms
  • of various orders of magnitude.
  • Time is in seconds, unless stated otherwise
  • n

Conclusions

  • Algorithms are the key to creating computing solutions.
  • The design of an algorithm can make the difference between useful and impractical.
  • Algorithms often have a trade-off between space (memory) and time.


The database is protected by copyright ©sckool.org 2016
send message

    Main page