# Algorithms and Analysis cs 2308 Foundations of cs II algorithms

 Date conversion 25.04.2017 Size 23,31 Kb.

## Algorithms and Analysis

• CS 2308 Foundations of CS II

## Algorithms

• Step-by-step instructions that tell a computing agent how to solve some problem using only finite resources
• Resources
• Memory
• CPU cycles
• Time/Space
• Types of instructions
• Sequential
• Conditional
• If statements
• Iterative
• Loops

## Pseudocode: The Interlingua for Algorithms

• an English-like description of the sequential, conditional, and iterative operations of an algorithm
• no rigid syntax. As with an essay, clarity and organization are key. So is completeness.

## Pseudocode Example

• Pseudocode Example
• Find Largest Number
• Input: A list of positive numbers
• Output: The largest number in the list
• Procedure:
• 1. Set Largest to zero
• 2. Set Current-Number to the first in the list
• 3. While there are more numbers in the list
• 3.1 if (the Current-Number > Largest) then
• 3.1.1 Set Largest to the Current-Number
• 3.2 Set Current-Number to the next one in the list
• 4. Output Largest
• conditional
• operation
• Pseudocode Example
• Find Largest Number
• Input: A list of positive numbers
• Output: The largest number in the list
• Procedure:
• 1. Set Largest to zero
• 2. Set Current-Number to the first in the list
• 3. While there are more numbers in the list
• 3.1 if (the Current-Number > Largest) then
• 3.1.1 Set Largest to the Current-Number
• 3.2 Set Current-Number to the next one in the list
• 4. Output Largest
• iterative
• operation
• Pseudocode Example
• Find Largest Number
• Input: A list of positive numbers
• Output: The largest number in the list
• Procedure:
• 1. Set Largest to zero
• 2. Set Current-Number to the first in the list
• 3. While there are more numbers in the list
• 3.1 if (the Current-Number > Largest) then
• 3.1.1 Set Largest to the Current-Number
• 3.2 Set Current-Number to the next one in the list
• 4. Output Largest
• Let’s “play computer” to review this algorithm…

## Algorithms vary in efficiency

• example: sum the numbers from 1 to n
• space requirement is constant (i.e. independent of n)
• time requirement is linear (i.e. grows linearly with n). This is written “O(n)”
• efficiency
• space= 3 memory cells
• time = t(step1) +
• t(step 2) +
• n t(step 4) +
• n t(step 5)
• 1. Set sum to 0
• 2. Set currNum to 1
• 3. Repeat until currNum > n
• 4. Set sum to sum + currNum
• 5. Set currNum to currNum + 1
• Algorithm I
• to see this graphically...

## Algorithm Is time requirements

• 0 1 2 … n
• size of the problem
• time
• y = mx + b
• time = m n + b
• The exact equation for the line is unknown
• because we lack precise values for the constants m and b.
• But, we can say:
• time is a linear function of the size of the problem
• time = O(n)

## Algorithm II for summation

• The “key insight”, due to Gauss:
• the numbers can be grouped into
• 50 pairs of the form:
• 1 + 100 = 101
• 2 + 99 = 101
• . . .
• 50 + 51 = 101
• First, consider a specific case: n = 100.
• }
• sum = 50 x 101
• This algorithm
• requires a single
• multiplication!
• Second, generalize the formula for any (even) n :
• sum = (n / 2) (n + 1)
• Time requirement is constant.
• time = O(1)

## Sequential Search: A Commonly used Algorithm

• Suppose you want to a find a student in the Texas State directory.
• It contains EIDs, names, phone numbers, lots of other information.
• You want a computer application for searching the directory: given an EID, return the student’s phone number.
• You want more, too, but this is a good start…

## Sequential Search of a student database

• algorithm
• to search database
• by EID :
• 1. ask user to enter EID to search for
• 2. set i to 1
• 3. set found to ‘no’
• 4. while i <= n and found = ‘no’ do
• if EID = EIDi then set found to ‘yes’
• else increment i by 1
• 7. if found = ‘no’ then print “no such student”
• else < student found at array index i >
• .
• .
• .
• .
• .
• .
• name EID major credit hrs.
• John Smith
• Paula Jones
• Led Belly
• Chuck Bin
• JS456
• PJ123
• LEB900
• CB1235
• .
• .
• .
• .
• .
• .
• physics
• history
• music
• math
• 36
• 125
• 72
• 89
• 1
• 2
• n
• 3

## Time requirements for sequential search

• average case (expected amount of work):
• EID found in studentn/2
• n/2 loop iterations
• because the amount of work is a constant multiple of n,
• the time requirement is O(n)
• in the worst case and the average case.
• amount
• of work
• best case (minimum amount of work):
• EID found in student1
• one loop iteration
• worst case (maximum amount of work):
• EID found in studentn
• n loop iterations

## O(n) searching is too slow

• Consider searching Texas State’s student database using
• sequential search on a computer capable of
• 20,000 integer comparisons per second:
• n = 150,000 (students registered during past 10 years)
• average case
• 150,000 comparisons 1 seconds = 3.75 seconds
• 2 20,000 comparisons
• x
• worst case
• 150,000 comparisons x 1 seconds = 7.5 seconds
• 20,000 comparisons
• Bad news for searching NYC phone book, IRS database, ...

## Searching an ordered list is faster: an example of binary search

• .
• .
• .
• .
• .
• .
• name student major credit hrs.
• John Smith
• Paula Jones
• Led Belly
• Chuck Bin
• 24576
• 36794
• 42356
• 93687
• .
• .
• .
• .
• .
• .
• physics
• history
• music
• math
• 36
• 125
• 72
• 89
• 1
• 2
• n
• 3
• student
• 24576
• 36794
• 38453
• 41200
• 43756
• 45987
• 47865
• 49277
• 51243
• 58925
• 59845
• 60011
• 60367
• 64596
• 86756
• 93687
• 1
• 2
• 3
• 4
• 5
• 6
• 7
• 8
• 9
• 10
• 11
• 12
• 13
• 14
• 15
• 16
• note: the
• student array
• is sorted in
• increasing
• order
• how would you
• search for 58925 ?
• 38453 ?
• 46589 ?
• Probe 1
• Probe 2
• Probe 3

## The binary search algorithm

• 1. ask user to input studentNum to search for
• 2. set found to ‘no’
• 3. while not done searching and found = ‘no’
• 4. set middle to the index counter at the middle of the student list
• 5. if studentNum = studentmiddle then set found to ‘yes’
• 6. if studentNum < studentmiddle then chop off the last half
• of the student list
• 7. If studentNum > studentmiddle then chop off the first half
• of the student list
• 8. if found = ‘no’ then print “no such student”
• else
• How?
• How?
• What does this mean?

## The binary search algorithm

• assuming that the entries in student are sorted in increasing order,
• 1. ask user to input studentNum to search for
• 2. set beginning to 1
• 3. set end to n
• 4. set found to ‘no’
• 5. while beginning <= end and found = ‘no’
• 6. set middle to (beginning + end) / 2 {round down to nearest integer}
• 7. if studentNum = studentmiddle then set found to ‘yes’
• 8. if studentNum < studentmiddle then set end to middle - 1
• 9. if studentNum > studentmiddle then set beginning to middle + 1
• 10.if found = ‘no’ then print “no such student”
• else

## Time requirements for binary search

• At each iteration of the loop, the algorithm cuts the list
• (i.e. the list called student) in half.
• In the worst case (i.e. when studentNum is not in the list called student)
• how many times will this happen?
• n = 16
• 1st iteration 16/2 = 8
• 2nd iteration 8/2 = 4
• 3rd iteration 4/2 = 2
• 4th iteration 2/2 = 1
• the number of times a number n
• can be cut in half and not go below 1
• is log2 n.
• Said another way:
• log2 n = m is equivalent to 2m = n

## This is a major improvement

• n sequential search binary search
• O(n) O(log2 n)
• 100 100 7
• 150,000 150,000 18
• 20,000,000 20,000,000 25
• 27=128
• 218=262,144
• 33,000,000
• number of comparisons
• needed in the worst case
• 150,000 comparisons x 1 second = 7.5 seconds
• 20,000 comparisons
• in terms of seconds...
• 18 comparisons x 1 second > .001 seconds
• 20,000 comparisons
• sequential search:
• binary search:
• vs.

## Sorting a list

• First, an algorithm that’s easy to write, but is badly inefficient...
• unsorted
• sorted
• 35467
• 67854
• 46781
• 13528
• 87341
• 1
• 2
• 3
• 4
• 5
• initially
• unsorted
• sorted
• 35467
• 67854
• 46781
• 13528
• 87341
• 1
• 2
• 3
• 4
• 5
• 13528
• 1st iteration
• unsorted
• sorted
• 35467
• 67854
• 46781
• 13528
• 87341
• 1
• 2
• 3
• 4
• 5
• 13528
• 35467
• 2nd iteration
• unsorted
• sorted
• 35467
• 67854
• 46781
• 13528
• 87341
• 1
• 2
• 3
• 4
• 5
• 13528
• 35467
• 46781
• etc.
• 3rd iteration

## The “Simple Sort” Algorithm

• 1. set i to 1
• 2. repeat until i > n
• set indexForSmallest to the index of the smallest
• positive value in unsorted
• 4. set sortedi to unsortedindexForSmallest
• 5. set unsortedindexForSmallest to 0
• 6. increment i
• given a list of positive numbers, unsorted1, …, unsortedn and
• another list, sorted1, …, sortedn, with all values initially set to zero

## This algorithm is expensive!

• Time requirement
• total time = n iterations x time per iteration
• time per iteration = time to find smallest value in a list of length n
• = O(n)
• total time = n x O(n)
• = O(n2)
• Space requirement
• total space = space for unsorted + space for sorted
• = O(2n)

## Creating Algorithms is the Challenge of Computer Science

• A salesperson wants to visit 25 cities while minimizing the total number of miles driven, visiting each city exactly once, and returning home again. Which route is best?
• It’s not easy; try this one:
• The Traveling Salesperson problem:

## Simplify the Problem to get an intuition about it

• A B
• C D
• four cities connected by roads
• Q: Starting at A, what’s the shortest route that meets the requirements?
• A: Obvious to anyone who looks at the entire map. Not so obvious to an
• algorithm that “sees”, at any one time, only one city and its roads
• One algorithm to answer the question:
• 1. generate all possible routes of length 5
• 2. check each path to determine whether it meets the requirement
• start at A,
• visit B, C, and D in some order,
• How much time does this algorithm require?

## All Paths from A of length 5

• A
• A D A D A D A D A D A D A D A D
• B C B C B C B C
• A D A D
• B C
• A B
• C D
• Number of paths to generate and check is 24 = 16.

## Can you Improve the Algorithm?

• Prune bad routes as soon as possible. What’s a “bad route?”
• Look for good solutions, even if they’re sub-optimal. What’s a “good solution?”

## This gets real bad, real fast!

• In general, the algorithm’s time requirement is: (the number of roads in&out of a city) number of cities
• Assuming the number of roads is only 2, the time requirement is 2number of cities , given by the powers of 2 table.
• Assuming a computer could evaluate 10 million routes per second, finding the best route for 25 cities would require about 3.5 seconds. No problem!
• However, finding the best route for 64 cities would require about seconds, or hours!
• 20 million
• 5000

## Comparing the time requirements

• 0 5 10 15
• n
• work
• 35
• 30
• 25
• 20
• 15
• 10
• 5
• 0
• n
• log2n
• n2
• 2n
• order 10 50 100 1,000
• log2n .0003 .0006 .0007 .001
• n .0001 .005 .01 .1
• n2 .01 .25 1 1.67 min
• 2n .1024 3570 4x1016 forget it!
• years centuries
• time requirements for algorithms
• of various orders of magnitude.
• Time is in seconds, unless stated otherwise
• n

## Conclusions

• Algorithms are the key to creating computing solutions.
• The design of an algorithm can make the difference between useful and impractical.
• Algorithms often have a trade-off between space (memory) and time.