Julian’s Notes

Developer Productivity

2018-01-03T17:13:05+00:00

What is Developer Productivity?

Productivity is defined as “the effectiveness of productive effort, especially in industry, as measured in terms of the rate of output per unit of input.” Developer productivity can therefore be described as a concept, a set of tools or processes, or a team dedicated to making other developers more efficient so that they can increase their overall output.

The concept, and the responsibilities of the team in charge of developer productivity, vary from company to company. However, after a number of meetings it became clear that these teams mostly focus on a few key areas. Unfortunately, the developers at different companies don’t seem to discuss their work, hold meet-ups, or work together on allied goals.

Developer productivity is not a business concern, nor is most of the work confidential. The work is generally generic and the concepts are easily shared and discussed. Over the course of many meetings with similar teams at other companies, it became obvious that we’re all duplicating each other’s work and, more importantly, we’re duplicating the exploration.

Vision

Developer Productivity is not a business concern and it is not specific to any company. The concepts are shareable and can help reduce time to explore, gain more feedback and expertise, and allow us to help define standards in the industry.

If we work together in creating a community and a set of community guidelines - we will all benefit.

Phase I

Phase I is simply the initial foundation of the community. The Slack channel, invitations at https://chat.devproductivity.io, is a good first step towards an initial foundation.

As we grow the audience, we can start to hold online meet-ups. The first one is currently scheduled for sometime in January 2018.

These simple ideas will allow us to start to grow a community and provide the foundations to introduce work on additional phases of the project.

Success Criteria: The community starts to grow and people remain excited. There is a decent turnout for the online meet-ups and people are asking what comes next.

Phase II

There is no current ground work laid to bootstrap a community dedicated to developer productivity. This means that we have the opportunity to create this community and share the work we do with each other.

https://devproductivity.io will be a central hub managed by the community. With links to content about continuous integration and testing, automation, operational excellence, developer environments, mobile tooling, and other aspects of developer productivity - this website will be a central location on which to grow a solid foundation.

This website/community allows us to draw talent and work from the larger pool of engineers working on the same goals. It also allows us to share our ideas and educate the larger developer community about standard goals.

This website is envisioned to be a handbook or styleguide for all people working on developer productivity. With a website similar to https://fastlane.tools or https://polaris.shopify.com, we will gain the ability to communicate industry best practices and crowd source their definitions. Moreover, we can expand the website to include blogs (written by the community members!), useful papers and links, forums, and other useful materials.

Success Criteria: The community gets excited about the website and we start to see traffic pick up. The community will start to become more involved and take ownership of parts of the content once the initial site is launched.

Phase III

As we grow, online meet-ups will likely not scale and people will want to have more meaningful in person collaborations and discussions. The website will need more community to continue to scale and continue to be a place that people frequent.

In Phase III, we hold our first conference. Development has had conferences of so many varieties, but there has never been a conference dedicated to the concepts that form developer productivity. That being said, some conferences, such as Velocity and DevDaysTO, dedicate portions of their content to developer productivity, but no known conference is targeted specifically at developer productivity.

By holding the conference, we aim to become a driving force behind Developer Productivity and increase community knowledge and commitment to this community.

Success Criteria: We see excitement and participation. The website and the content continue to increase in traffic and a vibrant community is formed.

Phase N

We may not need to go much further than a conference. If these phases are successful, we may have found a recipe to continue to grow and continue to be a power house in the developer productivity community.

We also don’t have all of the ideas and don’t want to be prescriptive about how the community grows. We want to see how it naturally grows and in what direction the community wants to grow.

Wakame (Seaweed)

2017-04-05T23:24:08+00:00

A staple in Japanese cuisine, Wakame (ワカメ or Undaria pinnatifida) is a sea vegetable/edible seaweed.

It has a subtly sweet flavour that is rich in umami. It is usually very salty too, and it has a satiny texture.

The leaves expand during cooking, so cut the pieces up with that in mind.

Uses

Wakame Salad (seaweed salad)
Toppings for sandwiches, meat dishes, etc
Soups
Side dish

Minerals and Nutrients

Watch out for

High in sodium

Good for

Wakame is low in calories and is a great source of vitamins and minerals. It includes:

iodine
iron
calcium
magnesium
folate
vitamin A
vitamin C
vitamin D
vitamin E
vitamin K
vitamin B2
lignans
fucoxanthin
eicosapentaenoic acid, an omega-3 fatty acid

Nutritional Information

Values are per 100g

Overview

Nutrient	Amount
Calories	45
Carbohydrates	9.14g
Sugars	0.65g
Dietary Fiber	0.5g
Fat	0.64g
Protein	3.03g

Vitamins

Vitamin	Percent	Amount
Thiamine (B1)	5%	0.06 mg
Riboflavin (B2)	19%	0.23 mg
Niacin (B3)	11%	1.6 mg
Pantothenic acid (B5)	14%	0.697 mg
Folate (B9)	49%	196 μg
Vitamin C	4%	3 mg
Vitamin E	7%	1 mg
Vitamin K	5%	5.3 μg

Minerals

Mineral	Percent	Amount
Calcium	15%	150 mg
Iron	17%	2.18 mg
Magnesium	30%	107 mg
Manganese	67%	1.4 mg
Phosphorus	11%	80 mg
Sodium	58%	872 mg
Zinc	4%	0.38 mg

umami

2017-04-05T23:24:08+00:00

Umami

(/uˈmɑːmi/)

Umami is also known as the “savoury taste” and is one of the five basic tastes. It is described as brothy or meaty.

People taste umami using taste receptors for glutamate (hence why monosodium glutamate [MSG] is essentially pure umami).

WIP

bundler/setup

2017-04-05T23:24:08+00:00

Bundler setup parses through dependencies and compiles them into proper load paths. On smaller applications this step takes very little time. On larger applications, however, it can take much longer: roughly 700-750ms.

Below are notes about how long certain parts take.

Timing Helper

Throughout these notes, I am using a method _t. This is a timing helper for scrappy timing, defined as such:

    def _t(label)
      t = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      ret = yield
      puts "#{label} #{Process.clock_gettime(Process::CLOCK_MONOTONIC) - t}"
      ret
    end

The key thing to note is that it uses the monotonic clock (elapsed wall-clock time, not CPU time) and that the return value is whatever the yield returns. The latter point makes it easy to track things down.
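As a quick usage sketch, the helper is repeated here so the snippet runs by itself. The summing workload is a hypothetical stand-in, only there to have something to time. Because _t returns the block's value, it can wrap an existing expression without changing what the surrounding code sees.

```ruby
def _t(label)
  t = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  ret = yield
  puts "#{label} #{Process.clock_gettime(Process::CLOCK_MONOTONIC) - t}"
  ret
end

# Hypothetical workload: the block's value passes straight through _t,
# so `total` is the sum and the elapsed seconds are only printed.
total = _t('sum') { (1..100_000).sum }
```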

Highest Level

If we open the bundler/setup.rb file up, we might notice that it is small enough to simply benchmark each line. Doing this results in the following sequence diagram:

We can take note that Bundler.setup results in almost the entire duration of the call to require 'bundler/setup'. Let’s dig into that more.

Bundler.setup

The call to Bundler.setup is a little bit ambiguous due to parameters, but checking the source_location at runtime results in setup at line 90 of lib/bundler.rb. This was what I originally thought, but it is good to check.

Bundler.method(:setup).source_location
["/Users/juliannadeau/.gem/ruby/2.3.3/gems/bundler-1.14.5/lib/bundler.rb", 90]

The method definition here is as follows:

return @setup if defined?(@setup) && @setup

definition.validate_runtime!

SharedHelpers.print_major_deprecations!

if groups.empty?
  # Load all groups, but only once
  @setup = load.setup
else
  load.setup(*groups)
end

We can see that it caches the original result on the Bundler class, so the expensive work only runs once per process. This is good as it will save a lot of time if we happen to call it twice.
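The caching idiom on its own looks roughly like the sketch below. Loader and its run counter are hypothetical names used for illustration; this is not Bundler's code, only the same guard on a class-level instance variable.

```ruby
class Loader
  @runs = 0

  class << self
    attr_reader :runs

    def setup
      # Same guard as above: reuse the cached result when it is set.
      return @setup if defined?(@setup) && @setup

      @runs += 1               # counts how often the slow path runs
      @setup = [:load_paths]   # stand-in for the real (expensive) result
    end
  end
end

first = Loader.setup
second = Loader.setup
```

Both calls return the same object, and the slow path is only taken once.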

A few questions I have up front:

is definition a variable or a method? Given that this is the first call to a class, it’s probably a method.
groups is almost definitely empty. It is probably a method too. Is it cached?
same thing with load

The reason this is important is that while the method calls on the return values of the methods mentioned above should be traced, we need to make sure that the method calls to get those return values aren’t slow either. To do this, we will need to split up the variable/method calls.

We end up with this:

return @setup if defined?(@setup) && @setup

d = _t('definition') do
  definition
end

_t('validate_runtime!') do
  d.validate_runtime!
end

_t('print_major_deprecations') do
  SharedHelpers.print_major_deprecations!
end

g = _t('groups') do
  groups
end

l = _t('load') do
  load
end

if g.empty?
  # Load all groups, but only once
  @setup = _t('setup 1') do
    l.setup
  end
else
  _t('setup 2') { l.setup(*groups) }
end

Results of timing

It is painfully obvious that we spend a lot of time in 2 spots. About 1/3 of the time is spent in definition, and the other 2/3 is spent in load.setup (specifically the setup call). We’ll dig into both of these separately.

To continue this path:

RubyGems

2017-04-05T23:24:08+00:00

Specification

Rails Autoloading

2017-04-05T23:24:08+00:00

Autoloading code is a mechanism in Rails that causes frameworks, classes, and code to be loaded automatically on boot. This helps productivity by allowing developers to freely use constants and classes without having to explicitly require them.

An issue arises however that large amounts of code that are not needed for boot are loaded during the boot of an application, or are loaded out of order.

The diagram below shows how files and classes are autoloaded.

Problem

Load order dependency issues can happen due to nested class definitions.

In the code snippet below, class A defines a nested class B. This means that the constant A::B is now defined. In the diagram above, we see that the un-nested class B depends on the const_missing hook to load it during autoload. However, inside A the constant B resolves to A::B, so the const_missing hook never fires and the top-level B is never loaded.

class A
   class B
   end
end

class B
end

In particular, from the diagram above, this part never happens.
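The effect can be reproduced in plain Ruby without Rails. In this sketch, Recorder and the const_missing override are hypothetical stand-ins for an autoloader: they only record which constants were reported missing. It assumes the autoloader hooks into const_missing, as described above.

```ruby
module Recorder
  MISSING = []
end

class A
  # Stand-in for an autoloader hook: remember the missing constant name.
  def self.const_missing(name)
    Recorder::MISSING << name
    Class.new
  end

  class B
  end

  def self.lookup_b
    B   # found lexically as A::B, so const_missing is never called
  end

  def self.lookup_c
    C   # nothing named C exists, so const_missing is called with :C
  end
end

resolved = A.lookup_b
missing_after_b = Recorder::MISSING.dup
A.lookup_c
missing_after_c = Recorder::MISSING.dup
```

After lookup_b nothing has been recorded, which is why an autoloader never gets the chance to load a top-level b.rb. Only the lookup of C reaches the hook.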

Caching Paths

2017-04-05T23:24:08+00:00

The Path Scanner is intended to identify all files and folders within a given path that are not in the bundler path already. As a result, we can then use this result to cache path loading.

** If the bundle path is a descendent of this path, we do additional checks to prevent recursing into the bundle path as we recurse through this path. We don’t want to scan the bundle path because anything useful in
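A minimal sketch of the scanning idea, assuming the standard library's Find module is acceptable for the walk. The method name scan_paths and the directory layout in the example are hypothetical; the real scanner may differ.

```ruby
require "find"
require "tmpdir"
require "fileutils"

# Collect every file and folder under root, but never descend into bundle_path.
def scan_paths(root, bundle_path)
  bundle = File.expand_path(bundle_path)
  found = []
  Find.find(root) do |path|
    # Find.prune skips this directory and everything below it.
    Find.prune if File.directory?(path) && File.expand_path(path) == bundle
    found << path unless path == root
  end
  found
end

scanned = Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "app"))
  FileUtils.mkdir_p(File.join(root, "vendor", "bundle"))
  File.write(File.join(root, "app", "a.rb"), "")
  File.write(File.join(root, "vendor", "bundle", "gem.rb"), "")
  # Return paths relative to root so the result outlives the temp dir.
  scan_paths(root, File.join(root, "vendor", "bundle")).map { |p| p.sub("#{root}/", "") }.sort
end
```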

Moving Average Convergence Divergence - MACD

2017-04-05T23:24:08+00:00

It is a trend following momentum indicator showing the relationship between 2 moving averages.

The MACD is calculated by:

subtracting the 26-day exponential moving average (EMA) from the 12-day EMA.
A 9-day EMA of the MACD is plotted on top of this.
It is used as a signal line to indicate when to buy and sell.
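The calculation above can be sketched as follows. The smoothing factor 2 / (period + 1) is the usual EMA definition; seeding the EMA with the first price is an assumption made here (some implementations seed with a simple moving average instead), and the method names are hypothetical.

```ruby
# Exponential moving average, seeded with the first value (an assumption).
def ema(values, period)
  alpha = 2.0 / (period + 1)
  values.each_with_object([]) do |value, out|
    out << (out.empty? ? value.to_f : alpha * value + (1 - alpha) * out.last)
  end
end

# MACD line = fast EMA - slow EMA; signal line = EMA of the MACD line.
def macd(prices, fast: 12, slow: 26, signal: 9)
  macd_line = ema(prices, fast).zip(ema(prices, slow)).map { |f, s| f - s }
  { macd: macd_line, signal: ema(macd_line, signal) }
end

flat   = macd(Array.new(40, 10.0))     # no trend: MACD stays at zero
rising = macd((1..40).map(&:to_f))     # steady rise: MACD goes positive
```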

Interpretation

Crossovers

When the MACD falls below the signal line, it is a “bearish” signal which indicates that it may be time to sell.

Conversely, when it rises above the signal line, it may indicate an upward momentum.
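Detecting crossovers is then a matter of watching the sign of (MACD - signal) change between consecutive days. This is a small sketch with a hypothetical method name and made-up input values, not trading advice.

```ruby
# Returns [index, :bullish | :bearish] pairs where the MACD crosses the signal line.
def crossovers(macd_line, signal_line)
  diffs = macd_line.zip(signal_line).map { |m, s| m - s }
  events = []
  diffs.each_cons(2).with_index(1) do |(prev, curr), i|
    events << [i, :bullish] if prev <= 0 && curr > 0   # rose above the signal line
    events << [i, :bearish] if prev >= 0 && curr < 0   # fell below the signal line
  end
  events
end

events = crossovers([-1.0, 1.0, 2.0, -1.0], [0.0, 0.0, 0.0, 0.0])
```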

Divergence

When the price diverges from the MACD, it signals that the current trend may be ending.

A Dramatic Rise

When the shorter term average (12-day EMA) pulls away from the longer term average (26-day EMA), it suggests that the stock is overbought and will soon return to normal levels.

Other

When the line moves above or below the zero line, this signals the position of the short term average relative to the long term average.

When it is above zero, the short term average is above the long term average. This indicates upward momentum. When it is below zero, it indicates downward momentum.

Logical Clocks

2017-04-05T23:24:08+00:00

This is a great presentation.

Logical clocks are used to agree on order in which events occur. The absolute/real time is not important in this concept.

Event ordering can be based on any number of factors. In a local system, CPU time can be used. But in a distributed system, there is no perfectly synchronized time or clock that can be used, and local times may not be in sync (and probably are not). Lamport suggested a logical clock be used to address this.

Key concepts

Processes exchange messages
Message must be sent before received
Send/receive used to order events and synchronize logical clocks

Properties

If A happens before B in the same process (or system), then A -> B
If A is the sending of a message and B is the receipt of that message, then A -> B
Relation is transitive: e.g A -> B and B -> C implies A -> C
Unordered events are concurrent: A !-> B and B !-> A implies A || B

Lamport’s Logical Clocks

If A -> B then timestamp(A) < timestamp(B)

Lamport’s Algorithm

Note: A -> B implies L(A) < L(B), but L(A) < L(B) does not necessarily imply A -> B. In other words, A -> B implies that the logical clock of A is less than that of B, but the logical clock of A being less than that of B does not imply that A -> B.
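The usual rules of the algorithm are: increment the counter before each local event or send, attach the counter to outgoing messages, and on receipt set the counter to max(local, received) + 1. A minimal sketch follows; the class and method names are hypothetical.

```ruby
class LamportClock
  attr_reader :time

  def initialize
    @time = 0
  end

  # A local event or a send: advance the clock and return the new timestamp.
  def tick
    @time += 1
  end

  # A receive: jump past the sender's timestamp if it is ahead of ours.
  def receive(sender_timestamp)
    @time = [@time, sender_timestamp].max + 1
  end
end

p1 = LamportClock.new
p2 = LamportClock.new

sent     = p1.tick            # P1 sends a message stamped with its clock
p2.tick                       # P2 does unrelated local work
received = p2.receive(sent)   # the receipt is stamped later than the send
```

The send happens before the receipt, and the timestamps reflect that. The unrelated tick on P2 gets the same timestamp as the send on P1, which illustrates why comparing timestamps alone does not imply a happened-before relationship.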

Totally Ordered Multicast

Example: We have a large distributed database. We need to make sure that updates are seen in the same order by all replicas. This requires us to multicast each update to all replicas in a total order.

Example Situation: The following events occur:

A) We have $1000 in a bank account.
B) We add $100
C) We calculate 1% interest on the balance.

If the order is ABC, then the 1% interest will be $1100 * 0.01 = $11. But if the order is ACB, then the interest will be $1000 * 0.01 = $10. In this case, the order matters as the interest is different.
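To see how two replicas drift apart, the sketch below applies the same two updates in both orders. It assumes the calculated interest is credited to the balance (the example above only says it is calculated) and uses whole dollars to avoid floating point noise.

```ruby
deposit  = ->(balance) { balance + 100 }
# Assumption: the 1% interest is added to the balance.
interest = ->(balance) { balance + balance / 100 }

# Replica 1 sees deposit then interest (ABC); replica 2 sees interest then deposit (ACB).
replica_1 = [deposit, interest].reduce(1000) { |balance, op| op.call(balance) }
replica_2 = [interest, deposit].reduce(1000) { |balance, op| op.call(balance) }
```

Replica 1 ends at 1111 and replica 2 at 1110, so the replicas disagree unless every replica applies the updates in the same order.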

Lamport’s logical clocks can be applied to implement a totally‐ordered multicast in a distributed system.

Implementation

Assumptions:

No messages are lost
Messages from the same sender are received in the same order as they were sent

Process P(i) will send out a message M(i) to all others with timestamp T(i). An incoming message is queued according to its timestamp. P(i) will pass a message to its own application if it meets 2 criteria: the message is at the head of the queue, and the message has been acked by all other processes.

All processes will end up with the same messages with the same timestamps, so order can be sorted out locally and therefore all messages are delivered in the same order.
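The following is a small single-threaded simulation of this scheme, under the two assumptions above. Network, Node and Message are hypothetical names. Two simplifications are made here: every process (including the sender) acks every message, and ties between equal timestamps are broken by sender id. The network keeps one FIFO queue per sender/receiver pair and delivers from a randomly chosen pair, so arrival order differs between nodes.

```ruby
Message = Struct.new(:kind, :id, :timestamp, :sender, :payload)

class Network
  def initialize(seed)
    @rng = Random.new(seed)
    @channels = Hash.new { |h, k| h[k] = [] }   # [from, to] => FIFO queue
    @nodes = {}
  end

  def register(node)
    @nodes[node.id] = node
  end

  def ids
    @nodes.keys
  end

  def broadcast(from, message)
    ids.each { |to| @channels[[from, to]] << message }
  end

  # Deliver queued messages in random channel order until nothing is in flight.
  def run
    loop do
      ready = @channels.reject { |_, queue| queue.empty? }.keys
      break if ready.empty?
      from, to = ready.sample(random: @rng)
      @nodes[to].receive(@channels[[from, to]].shift)
    end
  end
end

class Node
  attr_reader :id, :delivered

  def initialize(id, network)
    @id = id
    @network = network
    @clock = 0
    @queue = []                             # data messages awaiting delivery
    @acks = Hash.new { |h, k| h[k] = [] }   # message id => ids that acked it
    @delivered = []
    network.register(self)
  end

  def multicast(payload)
    @clock += 1
    @network.broadcast(@id, Message.new(:data, [@clock, @id], @clock, @id, payload))
  end

  def receive(message)
    @clock = [@clock, message.timestamp].max + 1
    if message.kind == :data
      @queue << message
      @queue.sort_by!(&:id)                 # order by [timestamp, sender]
      @clock += 1
      @network.broadcast(@id, Message.new(:ack, message.id, @clock, @id, nil))
    else
      @acks[message.id] << message.sender
    end
    deliver_ready
  end

  private

  # Pass the head of the queue to the application once everyone has acked it.
  def deliver_ready
    while (head = @queue.first) && (@network.ids - @acks[head.id]).empty?
      @delivered << @queue.shift.payload
    end
  end
end

network = Network.new(7)
nodes = 3.times.map { |i| Node.new(i, network) }
nodes[0].multicast("deposit")    # two concurrent updates from different nodes
nodes[1].multicast("interest")
network.run
orders = nodes.map(&:delivered)
```

Every node ends up with the same delivery order even though the messages and acks arrive in a different order at each node.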

bundler/lockfile_parser.rb

2017-04-05T23:24:08+00:00

Here, we see that parse_#{@state} is the bulk of the work. This is a dynamic call to parse methods… is any one of them slower than another?

To solve this, I split out the dynamic line into a case statement to see which lines were being called.

elsif @state
+ case @state.to_s
+ when 'source'
+     parse_source(line)
+ when 'dependency'
+     parse_dependency(line)
+ when 'spec'
+     parse_spec(line)
+ when 'platform'
+     parse_platform(line)
+ when 'bundled_with'
+     parse_bundled_with(line)
+ when 'ruby'
+     parse_ruby(line)
+ else
+     send("parse_#{@state}", line)
+ end
- send("parse_#{@state}", line)   
end

By the diagram below, we can see the following from our case statement:

parse_state	number	time	notes
parse_source	1131 times	32 ms	`SOURCE` did not include line, so it went to the case statement
parse_platform	1 time	1 ms	-
parse_dependency	237 times	15 ms	-
parse_bundled_with	1 time	1 ms	-

parse_source

parse_spec is the obvious bulk of this method, so let’s also look there.

parse_spec

The parse spec code looks like so:

def parse_spec(line)
  if line =~ NAME_VERSION_4
    name = $1
    version = $2
    platform = $3
    version = Gem::Version.new(version)
    platform = platform ? Gem::Platform.new(platform) : Gem::Platform::RUBY
    @current_spec = LazySpecification.new(name, version, platform)
    @current_spec.source = @current_source

    # Avoid introducing multiple copies of the same spec (caused by
    # duplicate GIT sections)
    @specs[@current_spec.identifier] ||= @current_spec
  elsif line =~ NAME_VERSION_6
    name = $1
    version = $2
    version = version.split(",").map(&:strip) if version
    dep = Gem::Dependency.new(name, version)
    @current_spec.dependencies << dep
  end
end

It takes about 15-17ms to run all of it. I’d like to see how often each part is called.

NAME_VERSION_4, called 374 times, took about 7ms
NAME_VERSION_6, called 480 times, took about 8ms
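For counts like these, a variant of the timing helper that accumulates per label is handy, since printing one line per call gets noisy at hundreds of calls. This is a sketch: STATS and _count are hypothetical names, and the sample lines only stand in for the real lockfile.

```ruby
STATS = Hash.new { |hash, label| hash[label] = { calls: 0, seconds: 0.0 } }

# Like _t, but accumulates call count and elapsed time per label instead of printing.
def _count(label)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
ensure
  stat = STATS[label]
  stat[:calls] += 1
  stat[:seconds] += Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

lines = ["    web-console (3.4.0)", "      actionview (>= 5.0)", "      railties (>= 5.0)"]
stripped = lines.map do |line|
  label = line.start_with?("      ") ? "NAME_VERSION_6" : "NAME_VERSION_4"
  _count(label) { line.strip }   # stand-in for the real branch body
end
```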

This means the two branches take about equally long in total, but the NAME_VERSION_4 branch is slower per call, taking about 0.000044s for each run as opposed to 0.000035s for each run.

So, what is the difference between these two? Well NAME_VERSION_4 is a top level dependency, whereas NAME_VERSION_6 is a sub-dependency, it seems.

 NAME VERSION 4     web-console (3.4.0)
 NAME VERSION 6       actionview (>= 5.0)
 NAME VERSION 6       activemodel (>= 5.0)
 NAME VERSION 6       debug_inspector
 NAME VERSION 6       railties (>= 5.0)
 NAME VERSION 4     webmock (2.3.2)
 NAME VERSION 6       addressable (>= 2.3.6)
 NAME VERSION 6       crack (>= 0.3.2)
 NAME VERSION 6       hashdiff

So what does this actually do? Seems it resolves specifications from the lockfile. The “4 space” (NAME VERSION 4) seems to also load a current spec, which I don’t quite get. Seems we re-assign this class level variable a lot to avoid passing it around.
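The role of the current spec becomes clearer in a stripped-down version of the same loop. The two regular expressions below are simplified stand-ins written for this note, not Bundler's NAME_VERSION_4 and NAME_VERSION_6, and plain hashes replace LazySpecification and Gem::Dependency.

```ruby
# Simplified stand-ins: 4 spaces = a spec, 6 spaces = one of its dependencies.
TOP_LEVEL = /\A {4}(\S+) \((.+)\)\z/
SUB_DEP   = /\A {6}(\S+)(?: \((.+)\))?\z/

def parse_specs(lines)
  specs = {}
  current = nil   # plays the role of @current_spec
  lines.each do |line|
    if (m = TOP_LEVEL.match(line))
      current = { version: m[2], dependencies: [] }
      specs[m[1]] ||= current
    elsif (m = SUB_DEP.match(line)) && current
      requirements = m[2] ? m[2].split(",").map(&:strip) : []
      current[:dependencies] << [m[1], requirements]
    end
  end
  specs
end

specs = parse_specs([
  "    web-console (3.4.0)",
  "      actionview (>= 5.0)",
  "      debug_inspector",
  "    webmock (2.3.2)",
  "      hashdiff"
])
```

A 6-space line carries no information about which spec it belongs to, so the parser has to remember the most recent 4-space line. That is what the current spec is for.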

We can see that dep = Gem::Dependency.new(name, version) (run 480 times) takes a chunk of time (14ms with gantt generation, 6ms in reality); otherwise there’s not much bulk here.

So, in the end the reason this file is slower is that it is iterating over many sources and creating Gem::Dependency objects. There is likely something we could do to make LockfileParser faster, but the work likely won’t be worth the time spent.

There isn’t much we can do to make this file faster without caching, for example by marshalling the parsed data.
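One shape such a cache could take is sketched below. Everything here is hypothetical: the cached_parse name, keying the cache on a SHA-256 of the lockfile contents, and the on-disk layout. The block stands in for the real parser, and its result must be something Marshal can serialise.

```ruby
require "digest"
require "tmpdir"

# Parse once per distinct lockfile content; later calls load the marshalled result.
def cached_parse(lockfile_path, cache_dir)
  key = Digest::SHA256.file(lockfile_path).hexdigest
  cache_path = File.join(cache_dir, "#{key}.marshal")
  return Marshal.load(File.binread(cache_path)) if File.exist?(cache_path)

  parsed = yield(File.read(lockfile_path))
  File.binwrite(cache_path, Marshal.dump(parsed))
  parsed
end

parse_runs = 0
results = Dir.mktmpdir do |dir|
  lockfile = File.join(dir, "Gemfile.lock")
  File.write(lockfile, "    webmock (2.3.2)\n      hashdiff\n")
  2.times.map do
    cached_parse(lockfile, dir) do |text|
      parse_runs += 1
      text.lines.map(&:strip)   # stand-in for the real parser
    end
  end
end
```

Because the key is derived from the file contents, any change to the lockfile misses the cache and triggers a fresh parse.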