Understanding Memoization

What is memoization?

From our good friend wikipedia:

In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.

Practical Example

Imagine if your web application had a has_permission method on your user model that was very expensive to calculate. Because of memoization, if that method is called multiple times, we can cache the result. This would mean that the expensive calculation would only be done once. Every subsequent call to the method would just return the cached value. In rails this would look like:

class User < ActiveRecord::Base
  def has_permission
    @has_permission ||= expensive_calculation
  end
end

Fibonacci Example

Another way to understand this concept is through the classic fibonacci problem.

This problem can be solved recursively:

def fib(num)
  return 0 if num == 0
  return 1 if num == 1
  fib(num - 1) + fib(num - 2)
end

Essentially for every fibonacci number we want to calculate, we find out the values of the previous two fibonacci number’s and sum them up. If we map out what happens with something like fib(4), we will notice that we have have a lot of duplicated calculations.

                            fib(4)
                   fib(3)            fib(2)
             fib(2)    fib(1)   fib(1)   fib(0)
        fib(1)   

So fib(1) is calculated 3 times, and fib(2) is calculated twice. That may not seem like a lot, but imagine if we started with fib(100). (To get an idea, for fib(10), fib(1) is calculated 55 times)

In programming we always want to do as little work as possible. If we calculate a value once, it would be nice if we didn’t have to calculate it again. This is the whole concept of memoization. Before returning fib(num -1) + fib(num -2) we want to check if fib(num) has been calculated. If it has been calculated, great, we just return it. If not, we calculate and then save it for future use. Here is a code example:

def fib(num, memo)
  return 0 if num == 0
  return 1 if num == 1
  return memo[num] if memo[num] # return the value if we have already 
                                # calculated memo[num]
  result = fib(num - 1, memo) + fib(num - 2, memo)
  memo[num] = result # store the result!
  result
end

Because of memoization, every fib(num) is only ever calculated once improving the performance of our code.

About Software Engineering

What is Software Engineering?

  • The word software engineering originated in 1960 at NATO conference
  • Software Engineering is not so much about code but about the methodologies taken to deliver a well developed product.

What are some methodologies use in SE?

  • Waterfall – Introduced by Winston Royce in 1970. First one to introduce the 5 Software Development Life Cycle (SDLC) method which are:
    1. Specification: Written executive summary with a final proposal plan.
    2. Design: Includes UML diagrams, classes and methods, no code!
    3. Implementation: Where coding takes place.
    4. Testing: Verification and validation phase, alpha and beta testing.
    5. Maintenance: Feedback.
  • Component Based Software Engineering (CBSE) – No need to reinvent the wheel, components are already available to start building or enhance the product (i.e. HelpScout, open source).
  • Clean Room Engineering (CRE) – An IBM software methodology which promises “zero defect software”, but is software ever bug free? There are a few requirements for CRE which are:
    1. Mathematical proofs of correctness for every line of code: before starting a project, the lead of the project must prove his approach mathematically.
    2. Programming in groups: IBM believes that programmers should work in groups not individually. This is because problems can be solved quicker in groups and there is a better learning opportunity.
    3. Statistical linguistic tool for testing: stress test the system to make sure everything is to specifications.
  • Spiral / Agile – encourages interaction between teams (product team with development team), engage discussion with funding agencies to enforce more communication. https://agilemanifesto.org/
  • Kanban – Originated in 1980s by Japanese car manufactories . Kanban promotes continuous collaboration and encourages active, ongoing learning and improving by defining the best possible team workflow. It is based on three key principles:
    1. Visualize what you do today (workflow): seeing all the items in context of each other can be very informative
    2. Limit the amount of work in progress (WIP): this helps balance the flow-based approach so teams don€™t start and commit to too much work at once
    3. Enhance flow: when something is finished, the next highest thing from the backlog is pulled into play

Parameterized GraphQL Fields!?!?!

Do you love GraphQL, but lament the inability to query values that depend on passed-in parameters? Great news, there is a solution: Parameterized GraphQL Fields!

Consider a query which returns some fun things about the user. One of which is their favorites! Favorites can return different categories of a user’s favorite things. For example you could pass in ‘books’ as a category, and get a list of the user’s favorite books. Or ‘cuisines’ to get a list of the user’s favorite… well you see where this is going!

Here’s the frontend code:

query user($user_id: ID, $category?: String) {
 user(user_id: $user_id) {
    id
    full_name
    favorites(category: $category)
  }
}

And the backend code, using graphql-ruby.

field :favorites, FavoriteType, 'Favorite things belonging to the user' do
argument :category, types.String
    resolve ->(user, args, ctx) {
      return user.favorites.where(category: args[:category])
    }
end

It’s just that easy—you really can have it all!

Update multiple records at once with Active Record

Have you ever found that you needed to update more than one record at a time? Rails has an elegant solution for this:

update_all

Updates all records in the current relation with details given. This method constructs a single SQL UPDATE statement and sends it straight to the database. It does not instantiate the involved models and it does not trigger Active Record callbacks or validations. However, values passed to update_all will still go through Active Record’s normal type casting and serialization.

Say you are working with a library book tracking system. One user has returned three books today.

# user with id 100 checked out books:
  User.find(100).books.checked_out
> [
   #<Book id:1, title: 'The Hobbit', status: 'checked_out', user_id: 100>, 
   #<Book id:2, title: 'Children of Time', status: 'checked_out', user_id: 100>,
   #<Book id:3, title: 'The Count of Monte Cristo', status: 'checked_out', user_id: 100>
  ]

You could update each record individually using the commonly used update active record function, however rails also provides the update_all function:

checked_out_books = User.find(100).books.checked_out
checked_out_books.update_all(status: 'returned')

Which is great! We have successfully marked the books in the system as returned with one call to the database (instead of 3)!

Some things to consider when using this method:

  • this method does not trigger model callbacks
  • this method does not update the updated_at field automatically. However you can manually pass the current DateTime to updated_at.

Protocol Associated Type

Protocol associated type is a very useful tool to add to your Swift toolkit. In order to fully understand and appreciate it, we must briefly touch on how a normal protocol works.

As you can see below, the Events protocol declares a string type for the events array.

protocol Events {
  var events: [String] { get set }
}

Therefore, the conforming type MeetupEvents struct can simply implement the events array as long as it uses the type declared in the protocol. What if you want more flexibility over the type and perhaps have the conforming types decide the type later on? This is where Protocol Associated Type comes in!

struct MeetupEvents: Events {
  var events = ["iOSoho", "Brooklyn Swift"]
}

You can think of a protocol associated type as a generic placeholder for declaring protocols in Swift without specifying the type upfront. The conforming type would declare an array of whatever type to fill EventType later on and implement the add function.

protocol Events {
  associatedtype EventType

  var events: [EventType] { get set }
  mutating func add(event: EventType)
}

Adding an extension to this protocol allows you to create a default implementation of the add function.

extension Events {
   mutating func add(event: EventType) {
     events.append(event)
   }
}

We create a struct to conform to Events and Swift connects the dots as the expected array could be of any type.

struct MeetupEvents: Events {
    var events = [String]()  
}

Lastly, we want to implement the add function after creating an instance of MeetupEvents in order to fully conform to the protocol.

var meetupEvents = MeetupEvents()
meetupEvents.add(event: "iOSoho")
meetupEvents.add(event: "Brooklyn Swift") 

Et voilà ! Now you can declare protocols without constraining it to a type upfront and give conforming types more flexibility. What do you think? Let me know in the comments below.

Benchmarking ruby string interpolation vs. concatenation

Ever wondered if there was a significant difference in performance between string interpolation vs. concatenation in ruby? Me too! So I put together this simple benchmark:

concatenation.rb

t = Time.now
a = []
10_000_000.times do
  a << "a" + " " + "b"
end
puts Time.now.to_f - t.to_f

interpolation.rb

t = Time.now
a = []
10_000_000.times do
  a << "#{"a"} #{"b"}"
end
puts Time.now.to_f - t.to_f

Results

➔ ruby interpolation.rb
1.5596427917480469
➔ ruby concatenation.rb
7.423596143722534

So there you have it. Interpolation is about 4.5x faster.

What do you think about my testing methodology? Is there a better way? Let me know!