load_breakdown

Posted on 2017-04-05

Reading time 0

Load Breakdown

load

Posted on 2017-04-05

Reading time 6

initialize

A quick look at load.setup shows us that the load method takes a small amount of time (0.0016739999991841614s). This means the bulk of the time is spent in setup.

setup

This method took about 0.6628200000268407s to run. As we can see, specs = groups.any? ? @definition.specs_for(groups) : requested_specs takes the most time (about 85% of the time). Let’s break that down a bit. I’ll just change the ternary to an if/else and see what that produces. As we can see, @definition.specs_for(groups) is not even called. All the time is spent in requested_specs.

requested_specs

It seems this delegates to definition. In definition, this is the result: Let’s look at specs_for

specs_for

specs.for(expand_dependencies(deps)) takes the most time, but is it the specs.for part, or the expand_dependencies part? It is the specs.for part:

specs.for

As we can see, about 3/4 of the time is spent making the specs, and 1/4 of the time processing with for.

specs

This line does quite a lot (resolve.materialize(Bundler.settings[:cache_all_platforms] ? dependencies : requested_dependencies)), so let’s split it up. As we can see, resolve and materialize take the most time.

Materializing

line            num_calls   time (s)
resolve         73          0.05524000007426366
materialize     1           0.1665900000371039
materialize     374         0.15040999941993505
specs           374         0.13627100008307025
rubygems spec   296         0.03416500013554469
git specs       293         0.09133500000461936
search          596         0.012452999944798648

git-based specs

We can see that we load 82 gemspecs, which takes the most time. Can we cache loading those gemspecs? They aren’t going to change in between loads. Globbing the filesystem also takes a chunk of time...
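The per-branch timings described above can be collected with Ruby's standard Benchmark module. The sketch below is only an illustration of that technique: the timed helper, the TIMINGS table, and the stand-in blocks are hypothetical names, not Bundler code, and the real measurements in this post may have been gathered differently.

```ruby
require "benchmark"

# Hypothetical accumulator: label => number of calls and total seconds.
TIMINGS = Hash.new { |hash, label| hash[label] = { calls: 0, time: 0.0 } }

# Runs the block, records how long it took under the label,
# and returns the block's own result so it can wrap existing code.
def timed(label)
  result = nil
  elapsed = Benchmark.realtime { result = yield }
  TIMINGS[label][:calls] += 1
  TIMINGS[label][:time] += elapsed
  result
end

# Splitting a ternary into if/else lets each branch be timed separately.
# The blocks are stand-ins for @definition.specs_for(groups) and requested_specs.
groups = []
specs = if groups.any?
  timed(:specs_for) { [:stand_in_spec] }
else
  timed(:requested_specs) { [:stand_in_spec] }
end

TIMINGS.each do |label, stats|
  puts format("%-16s %5d %.6f", label, stats[:calls], stats[:time])
end
```

Because timed returns the wrapped block's value, it can be dropped around an expression without changing what the surrounding method returns.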

Experimental Rewrite

Posted on 2017-04-05

Reading time 0

MVP includes:

- gem support
- source support
- group support
- bundle install

This is a very naive approach as it doesn’t really take into account resolving nested dependencies in gemspecs. The lockfile consists of a very simple file in the following format for easy parsing:

checksum 12345abcdef
gem_name gem_version
gem_name gem_version
gem_name gem_version
gem_name gem_version
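A format like this can be parsed in a few lines. The sketch below assumes the checksum sits on the first line and that every following line holds one gem name and one version separated by whitespace; the Lockfile struct, the parse_lockfile name, and the sample gems are hypothetical and not taken from the experimental rewrite itself.

```ruby
# Hypothetical container for the parsed lockfile.
Lockfile = Struct.new(:checksum, :gems)

# Parses text shaped like:
#   checksum <hex>
#   <gem_name> <gem_version>
# The one-pair-per-line layout is an assumption about the format.
def parse_lockfile(text)
  lines = text.lines.map(&:strip).reject(&:empty?)
  keyword, checksum = lines.shift.to_s.split(" ", 2)
  unless keyword == "checksum" && checksum
    raise ArgumentError, "expected a checksum line first"
  end

  gems = lines.map do |line|
    name, version = line.split(" ", 2)
    raise ArgumentError, "malformed gem line: #{line.inspect}" unless version
    [name, version.strip]
  end.to_h

  Lockfile.new(checksum, gems)
end

lockfile = parse_lockfile("checksum 12345abcdef\nrake 12.0.0\nrack 2.0.1\n")
puts lockfile.checksum
puts lockfile.gems.inspect
```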

bundler/definition.rb

Posted on 2017-04-05

Reading time 10

Bundler#definition

As we can see, Definition.build takes a long time to process.

Definition.build

From here we can see Dsl.evaluate takes the most time.

Dsl.evaluate

We can see that the time is split between eval_gemfile and to_definition.

builder.eval_gemfile

We can see here that when we take the contents of the Gemfile and instance_eval it, we’ll spend about 55ms doing that. Digging into the instance_eval a little more using TracePoint, we can see that there are hundreds of mini-methods called, starting with Dsl#source. We get this approximate trace:

[161, Bundler::Dsl, :source, :call]
[336, Bundler::Dsl, :normalize_hash, :call]
[435, Bundler::Dsl, :normalize_source, :call]
[449, Bundler::Dsl, :check_primary_source_safety, :call]
[90, Bundler::SourceList, :rubygems_primary_remotes, :call]
[38, Bundler::SourceList, :add_rubygems_remote, :call]
[210, Bundler::Source::Rubygems, :add_remote, :call]
...
[115, Bundler::SourceList, :warn_on_git_protocol, :call]
[245, #<Class:Bundler>, :settings, :call]
[54, Bundler::Settings, :[], :call]
[224, Bundler::Settings, :key_for, :call]
[325, Bundler::Dsl, :with_source, :call]
[79, Bundler::Dependency, :initialize, :call]
[38, Gem::Dependency, :initialize, :call]
...
[54, #<Class:Gem::Requirement>, :create, :call]
[123, Gem::Requirement, :initialize, :call]
[121, Bundler::Dsl, :gem, :call]
[347, Bundler::Dsl, :normalize_options, :call]
[336, Bundler::Dsl, :normalize_hash, :call]
[343, Bundler::Dsl, :valid_keys, :call]
[418, Bundler::Dsl, :validate_keys, :call]
[209, Bundler::Dsl, :git, :call]
[336, Bundler::Dsl, :normalize_hash, :call]
[24, Bundler::SourceList, :add_git_source, :call]
[13, Bundler::Source::Git, :initialize, :call]
[96, Bundler::SourceList, :add_source_to_list, :call]
[49, Bundler::Source::Git, :hash, :call]
[79, Bundler::Source::Git, :name, :call]
[49, Bundler::Source::Git, :hash, :call]
[79, Bundler::Source::Git, :name, :call]
... repeat the last block a lot, particularly Bundler::Source::Git calls ...
[115, Bundler::SourceList, :warn_on_git_protocol, :call]
[245, #<Class:Bundler>, :settings, :call]
[54, Bundler::Settings, :[], :call]
[224, Bundler::Settings, :key_for, :call]
[325, Bundler::Dsl, :with_source, :call]
[79, Bundler::Dependency, :initialize, :call]
[38, Gem::Dependency, :initialize, :call]

Without optimizing...
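A trace in this shape can be produced with TracePoint. The sketch below is one way to do it, assuming each entry is [line number, defining class, method name, event]; the trace_calls helper and the Greeter class are hypothetical stand-ins, and the original trace may have been captured with a different setup.

```ruby
# Collects one [lineno, defined_class, method_id, event] entry for every
# Ruby-level method call made while the given block runs.
def trace_calls
  events = []
  trace = TracePoint.new(:call) do |tp|
    events << [tp.lineno, tp.defined_class, tp.method_id, tp.event]
  end
  # Passing a block to enable turns tracing off again when the block exits.
  trace.enable { yield }
  events
end

# Hypothetical stand-in for the code under investigation.
class Greeter
  def hello
    name
  end

  def name
    "world"
  end
end

trace_calls { Greeter.new.hello }.each { |event| p event }
```

The :call event only fires for methods written in Ruby, so methods implemented in C do not show up unless :c_call is traced as well.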

Convolutional Neural Networks

Posted on 2017-04-05

Reading time 0

CNNs are good for image recognition and classification. They also excel at natural language processing tasks. An early predecessor called LeNet was built in the late 80s and throughout the 90s. It was mainly used for OCR. For reference, this article was used.

Glossary and Concepts

- Images are represented by a matrix of values based on their channel
- Channels are the values represented by a component of an image (e.g. RGB or CMYK). A typical image has 3 components for RGB, a grayscale image has 1 component.

Operations of CNNs

- Convolution
- Non Linearity (ReLU)
- Pooling or sub sampling
- Classification (fully connected sublayer)

Convolution

This is a WIP
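As a rough illustration of the first two operations, the sketch below slides a small kernel over a single-channel image (no padding, stride 1) and then applies ReLU. It is a plain-Ruby toy under those assumptions, not how a real CNN library implements these steps; the function names and the sample values are made up for the example.

```ruby
# Slides the kernel over the image and sums the element-wise products
# at each position (no padding, stride 1, single channel).
def convolve(image, kernel)
  k_rows = kernel.length
  k_cols = kernel.first.length
  out_rows = image.length - k_rows + 1
  out_cols = image.first.length - k_cols + 1

  Array.new(out_rows) do |r|
    Array.new(out_cols) do |c|
      sum = 0
      k_rows.times do |i|
        k_cols.times do |j|
          sum += image[r + i][c + j] * kernel[i][j]
        end
      end
      sum
    end
  end
end

# ReLU replaces every negative value in the feature map with zero.
def relu(feature_map)
  feature_map.map { |row| row.map { |value| [value, 0].max } }
end

image = [[3, 1, 0],
         [0, 1, 2],
         [1, 0, 5]]
kernel = [[1, 0],
          [0, -1]]

feature_map = convolve(image, kernel)
p feature_map        # [[2, -1], [0, -4]]
p relu(feature_map)  # [[2, 0], [0, 0]]
```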

Constant Lookup

Posted on 2017-04-05

Reading time 1

This is the flowchart that Ruby follows to look up a constant. The source for this flowchart was parsed from the definition of const_get in Ruby 2.1.0.
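A small example can make the lookup order concrete. The sketch below relies on standard Ruby behaviour rather than on the flowchart itself: a bare constant reference checks the enclosing lexical scopes before the ancestor chain, while const_get on a class starts from that class and its ancestors. The module and class names are hypothetical.

```ruby
module Outer
  VALUE = :lexical

  class Base
    VALUE = :inherited
  end

  class Child < Base
    # A bare reference looks in the lexical scopes (Child, then Outer)
    # before it consults Child's ancestors.
    def self.bare_lookup
      VALUE
    end
  end
end

p Outer::Child.bare_lookup        # :lexical
p Outer::Child.const_get(:VALUE)  # :inherited
```

Passing false as the second argument to const_get disables the ancestor search, so Outer::Child.const_get(:VALUE, false) raises NameError because Child defines no VALUE of its own.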

Kubernetes Configs

Posted on 2017-04-05

Reading time 3

To run a Kubernetes cluster, you can group services in namespaces. This keeps groups of services and deployments in separate namespaced sections. To create a namespace, run kubectl create namespace <NAMESPACE>

After creating a namespace, you can apply configurations. In particular, you want deployments, which run the containers. Services expose those deployments within the cluster. This makes them accessible to other deployments.

Below are pieces of configurations. You can combine many together and apply them all at once. For example, my website defines a number of configurations all in one yaml file:

- a deployment for an app server
- a service definition that exposes the app server
- a deployment and a service that run Postgres

In the app server definition, we can access the database using postgres.NAMESPACE.svc.cluster.local

Finally, an ingress is defined to expose the app on a URL. Run kubectl apply -f PathToYaml.yml -n NAMESPACE to apply it.

Deployment

A deployment should specify a few things. Namely, it should specify the docker image you would like to run, volumes you’d like to mount, and environment variables to use in the container.

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: website
  namespace: website
spec:
  replicas: 1 # Run a single replica
  template:
    metadata:
      labels:
        name: website
        app: website
        environment: production
    spec:
      containers:
      - name: website
        # This image will pull from the docker registry
        # I have built and pushed this image already
        image: jules2689/website:v1.03
        imagePullPolicy: Always
        # The container runs the application on port 3000
        ports:
        -...

Caching Paths

Posted on 2017-04-05

Reading time 1

Caching paths is the main function of bootsnap. Previously, I mentioned that Bootsnap creates 2 caches:

- Stable: for Gems and Rubies, since these are highly unlikely to change
- Volatile: for everything else, like your app code, since this is likely to change

This path is shown in the flowchart below. In a number of instances, scan is mentioned. This refers to the operation performed by the Path Scanner.

Mtimes (modified times) of files and directories

We do not take mtimes into account for stable caches. Reading mtimes is a more expensive operation, so we skip it when we can to avoid as many filesystem calls as possible. This means that for a “stable” cache we simply use 0 as the mtime for all files, so there is no effect on the cache heuristic.

For a “volatile” cache, however, we find the maximum mtime of all files and directories in the given path. This means that if any file within a directory is added or removed, the cache is invalidated. Note that the mtime is initialized at -1, so if the path doesn’t exist, -1 will be returned.
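The mtime rule described above can be sketched in a few lines of Ruby. This is a minimal illustration of the idea, not Bootsnap's actual implementation: the max_mtime name and its stable: keyword are assumptions, and the standard Find module stands in for the Path Scanner.

```ruby
require "find"

# Returns the value used to decide whether a cached path is still valid:
#   0  for stable paths (mtimes are never read),
#   -1 when the path does not exist,
#   otherwise the newest mtime of any file or directory under the path.
def max_mtime(path, stable: false)
  return 0 if stable

  max = -1
  return max unless File.exist?(path)

  Find.find(path) do |entry|
    begin
      mtime = File.mtime(entry).to_i
      max = mtime if mtime > max
    rescue Errno::ENOENT
      # The entry disappeared between listing and stat; skip it.
    end
  end
  max
end

puts max_mtime(Dir.pwd)
puts max_mtime(Dir.pwd, stable: true)
```

Adding or removing a file updates the mtime of its parent directory, which is why including directories in the maximum is enough to notice those changes.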

braindump

Posted on 2017-04-05

Reading time 0

Things to investigate further

- tl;dr kubelet - the thing that actually runs stuff - can run from a static manifest, anything you put in /etc/kubernetes, it will run
- Runit Services
- Amdahl’s Law
- CIDR Block IP Address Blocks
- UDP Hole Punching

Websites to read

- http://basho.com/posts/technical/why-vector-clocks-are-easy/
- http://basho.com/posts/technical/why-vector-clocks-are-hard/
- http://basho.com/posts/technical/vector-clocks-revisited/
- http://basho.com/posts/technical/vector-clocks-revisited-part-2-dotted-version-vectors/
- http://valerieaurora.org/hash.html

Bootsnap

Posted on 2017-04-05

Reading time 1

Bootsnap is a library that overrides Kernel#require, Kernel#load, and Module#autoload. When ActiveSupport is used, it also overrides a number of ActiveSupport methods.

Bootsnap creates 2 kinds of caches. The first is a stable, long-lived cache built from Ruby and Gem directories. These are assumed to never change, so we can cache them more aggressively. Application code is expected to change frequently, so it is cached with little aggression (short-lived bursts that should last only as long as the app takes to boot). This is the “volatile” cache.

Below is a diagram explaining how the overrides work. In this diagram, you might notice that we refer to cache and autoload_path_cache as the main points of override. These are calculated using the concepts described in Caching Paths.
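The stable versus volatile split can be pictured as a simple path check. The sketch below is an assumption about how such a classification might look, not Bootsnap's code: cache_kind and DEFAULT_STABLE_ROOTS are hypothetical names, and the default roots (Ruby's library directory plus the gem paths) are only one plausible choice.

```ruby
require "rbconfig"
require "rubygems"

# Assumed stable roots: Ruby's own library directory and the gem directories.
DEFAULT_STABLE_ROOTS = [RbConfig::CONFIG["rubylibdir"], *Gem.path]
  .compact
  .map { |root| File.expand_path(root) }
  .freeze

# Returns :stable for paths inside a stable root, :volatile for everything else.
def cache_kind(path, stable_roots = DEFAULT_STABLE_ROOTS)
  expanded = File.expand_path(path)
  inside_stable_root = stable_roots.any? do |root|
    expanded == root || expanded.start_with?(root + File::SEPARATOR)
  end
  inside_stable_root ? :stable : :volatile
end

$LOAD_PATH.each { |entry| puts "#{cache_kind(entry)}  #{entry}" }
```

Comparing against the root plus a path separator avoids treating a sibling directory such as /opt/ruby-extras as if it were inside /opt/ruby.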

Julian Nadeau

Notes about various topics. Mostly computer science related.
