3 tips to make you a better developer


introduction

I’m sorry for the clickbait headline; I didn’t have a better idea or name for it.

We (developers) occasionally produce lazy or messy code, and from time to time we need to remember the most important rule: “We write code to solve problems, but also so that human beings are able to use, maintain and evolve it”.

TL;DR (a unit can be a function, variable, method, class, parameter, etc.)

  1. Name your units with care and meaning;
  2. Try to see your code as a series of transformations;
  3. When possible, make your units generic.

Keep in mind that these tips are just my opinions; at best they are based on excellent books (Refactoring, DDD, Clean Coder, etc.), articles and blog posts, excellent people I’ve worked and paired with, presentations, tweets and experience.

naming is hard

Name your units with care and meaning. Your code should be easy to understand.

Although naming things is really hard, it is also extremely important. Let’s see a snippet of code:


var topComments = (id) => {
  var succCB = (d) => {
    var a = d.data.comments
    var top = []
    a.sort((d1, d2) => {
      return new Date(d2.date) - new Date(d1.date)
    })
    a.forEach((c) => {
      if (c.isTop()) {
        top.push(c)
      }
    })
    app.topComments = top.slice(0,10)
  }
  var errCB = (e) => {
    this.sendError(e)
    app.topComments = []
  }
  this.ajax(`/all/${id}/comments/`, succCB, errCB)
}

Let’s discuss the code above:

  • the function topComments receives an id, but is it the id of a comment, a user or an article? Let’s say it’s the user’s; naming it userId removes the doubt.
  • the function is called topComments, but it actually fetches only the 10 latest top comments, so we could call it top10LatestCommentsFrom.
  • the ajax function accepts two callbacks, one for success (succCB) and one for failure (errCB); I believe onSuccess and onError are easier to understand.
  • all the arguments use abbreviated names; spelling the names out makes them less confusing.
  • you get the idea: name things so that the code is clear!


var top10LatestCommentsFrom = (userId) => {
  var onSuccess = (rawData) => {
    var topLatestComments = []
    var allComments = rawData.data.comments
    allComments = allComments.sort((comment1, comment2) => {
      // newest first
      return new Date(comment2.date) - new Date(comment1.date)
    })
    allComments.forEach((comment) => {
      if (comment.isTop()) {
        topLatestComments.push(comment)
      }
    })
    app.topComments = topLatestComments.slice(0,10)
  }
  var onError = (error) => {
    this.sendError(error)
    app.topComments = []
  }
  this.ajax(`/all/${userId}/comments/`, onSuccess, onError)
}

Although this code still has many problems, it’s now easier to understand, and all we did was name things properly.

Of course there are cases where short names are just fine. For example, when you’re developing an emulator or a virtual machine you often use short names like sp (stack pointer) and pc (program counter), and the same goes for a very generic unit.


class LR35902 {
  init() {
    // 16-bit registers: program counter and stack pointer
    this.pc = this.sp = 0x0000
    // 8-bit registers
    this.a = this.b = this.c = this.d = this.e = this.f = this.h = this.l = 0x00
  }
  execute() {
    var opCode = memory.read(this.pc)
    this.perform(opCode)
    this.pc += 2
  }
}


filter -> union -> compact -> kick

Try to see and fit your code as transformations, one after another.

Some say that in computer science almost all problems can be reduced to just two major ones: sorting and counting things (plus doing these in a distributed environment). Anyway, the point is that we are usually coding to transform data.

For instance our function top10LatestCommentsFrom could be summarized in these steps:

  1. fetch comments (all)
  2. sort them (by date)
  3. filter them (only top)
  4. select the first 10

These are just transformations over an initial list, and with that mindset we can make our function top10LatestCommentsFrom much better.


var onError = (error) => this.sendError(error)
// newest first
var byDate = (comment1, comment2) => new Date(comment2.date) - new Date(comment1.date)
var onlyTops = (comment) => comment.isTop()
var top10LatestComments = (rawData) => {
  app.topComments = rawData.data.comments
    .sort(byDate)
    .filter(onlyTops)
    .slice(0, 10)
}
var userId = 68
ajax(`/${userId}`, top10LatestComments, onError)
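One more step in the same direction, as a rough sketch: if the pipeline returns its result instead of assigning to app.topComments, it can be tried out without any ajax call. The sample data is made up, and isTop is modelled here as a plain boolean field rather than the isTop() method used above.

```javascript
// Hypothetical sample data; `isTop` is a plain boolean field in this sketch.
const byNewestFirst = (a, b) => new Date(b.date) - new Date(a.date)
const onlyTops = (comment) => comment.isTop

const top10LatestComments = (comments) =>
  [...comments] // copy first, because Array.prototype.sort sorts in place
    .sort(byNewestFirst)
    .filter(onlyTops)
    .slice(0, 10)

const sample = [
  { id: 1, date: '2015-01-01', isTop: true },
  { id: 2, date: '2015-03-01', isTop: false },
  { id: 3, date: '2015-02-01', isTop: true },
]

console.log(top10LatestComments(sample).map((comment) => comment.id)) // [ 3, 1 ]
```

Copying the array before sorting is a design choice, not a requirement: it keeps the data received from the server untouched for any other consumer.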

 

By the way, this mindset can make it easier to understand the new kid on the block, sometimes referred to as Functional Reactive Programming.

<be generic>

Work to make your units generic.

Let’s imagine you are in an interview process and your first task is to code a function which prints the numbers 1, 2 and 3, each prefixed with “Hello, ”. It should print “Hello, 1”, then “Hello, 2”…


var printNumbers = () => {
  [1,2,3].forEach((number) => console.log(`Hello, ${number}`))
}
printNumbers()


Now they ask you to also print the letters “D”, “K” and “C”.


var print = (list) => {
  list.forEach((number) => console.log(`Hello, ${number}`))
}
print([1,2,3])
print(["D","K","C"])


That was the first step toward “generic”. Now the interviewers say you also have to print a list of people’s names, but this time it’ll be a list of objects [{name: “person”},…].


var print = (list) => {
  list.forEach((item) => {
    // naming becomes harder 😦
    var itemDescription
    if (typeof item === "object") {
      itemDescription = item.name
    } else {
      itemDescription = item
    }
    console.log(`Annyong, ${itemDescription}`)
  })
}
print([1,2,3])
print(["d","k","c"])
print([{name: "Buster Lose Seal"}, {name: "Neo Cortex"}])


Things start to get specific again, and the interviewers want to test you. They ask you to print a list of car brands [{brand: “Ferrari”}, ..] plus a list of game consoles with their architecture [{name: “PS4”, arch: “x86-64”}, …]


var print = (list) => {
  list.forEach((item) => {
    var itemDescription
    if (typeof item === "object") {
      itemDescription = item.name || item.brand
      if (item.arch) {
        itemDescription = `${item.name}${item.arch}`
      }
    } else {
      itemDescription = item
    }
    console.log(`Hello, ${itemDescription}`)
  })
}
print([1,2,3])
print(["D","K","C"])
print([{name: "Buster Lose Seal"},{name: "Neo Cortex"}])
print([{brand: "Ferrari"}, {brand: "Mercedes"}])
print([{name: "N64", arch: "MIPS"}, {name: "3DS", arch: "ARM9"}])


Yikes, I suppose you’re not proud of that code, and your interviewers will probably be a little concerned about your development skills. Let’s list some of the problems with this approach.

  • Naming (we’re calling a person an item)
  • High coupling (the function print knows too much about each printable)
  • Lots of (inner) conditionals 😦 it’s really hard to read, maintain and evolve this code

What can we do?! Well, it seems that all we need to do is iterate through an array and print each item, but each kind of item requires a different way of printing.


var defaultPrint = (item) => console.log(`Hello, ${item}`)
var myForEach = (list, printFunction = defaultPrint) => {
  list.forEach((item) => printFunction(item))
}
myForEach([1,2,3])
myForEach(["D","K","C"])
myForEach([{name: "Tomba!"},{name: "Neo Cortex"}], (person) => defaultPrint(person.name))
myForEach([{brand: "Ferrari"}, {brand: "Mercedes"}], (car) => defaultPrint(car.brand))
// the parameter is called gameConsole so it does not shadow the global `console`
myForEach([{name: "N64", arch: "MIPS"}, {name: "3DS", arch: "ARM9"}], (gameConsole) => defaultPrint(`${gameConsole.name}${gameConsole.arch}`))
// in fact we could even use the `Array.prototype.forEach` function and avoid the duplication with `myForEach`
// (which does basically what forEach does)
// var forEach = Array.prototype.forEach
// forEach.call([{name: "Tomba!"},{name: "NeoCortex"}], (person) => defaultPrint(person.name))


 

I said naming is important, but when you make something very generic, its names should be abstract too, not tied to any concrete concept. In fact, in Haskell (let’s pretend I know Haskell), when the concrete type of something may vary, single letters are used in its place.


function makelist(x) {return [x]}
makelist(document.head[0])
makelist("DKC:TF")
makelist("6502")
makelist(0xf13e455d7a1b96cd2d930e578284d889)

Bonus round

  1. Make your units of execution perform a single task.
  2. Use dispatch, pattern matching, protocols or something similar instead of conditionals.
  3. Enforce DRY as much as you can.
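To make the second bonus tip a bit more concrete, here is one possible sketch of dispatch replacing the nested conditionals of the earlier print function. The kind field and the describers map are assumptions made for this example; the objects in the interview story do not carry such a tag.

```javascript
// Hypothetical `kind` tag: each printable says what it is, and a lookup
// table (instead of if/else chains) decides how to describe it.
const describers = new Map([
  ['person', (item) => item.name],
  ['car', (item) => item.brand],
  ['gameConsole', (item) => `${item.name} (${item.arch})`],
])

// Fallback for plain values such as numbers and strings.
const plain = (item) => String(item)

const describe = (item) => (describers.get(item?.kind) ?? plain)(item)
const greet = (item) => `Hello, ${describe(item)}`

console.log(greet(1))
console.log(greet({ kind: 'car', brand: 'Ferrari' }))
console.log(greet({ kind: 'gameConsole', name: 'N64', arch: 'MIPS' }))
```

Supporting a new kind of printable then means adding one entry to the map, and each entry performs a single task.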

presentation – Live Video Platform for FIFA World Cup


In this talk, we will describe globo.com’s live video stream architecture, which was used to broadcast events such as the FIFA World Cup (with a peak of 500K concurrent users), Brazilian election debates (27 simultaneous streams) and BBB (10 cameras streaming 24/7 for 3 months).

NGINX is one of the main components of our platform, as we use it for content distribution, caching, authentication, and dynamic content. Besides our architecture, we will also discuss the Nginx and operating system tuning that was required for a 19Gbps throughput in each node, the open source Cassandra driver for Nginx that we developed, and our recent efforts to migrate to nginx-rtmp.

Great sources of info

Standing on the shoulders of giants

And for sure HN and SO.

FIFA 2014 World Cup live stream architecture

We were given the task of streaming the 2014 FIFA World Cup, and I think the experience is worth sharing. This is a quick overview of the architecture, the components, the pain, the learning, the open source and so on.

The numbers

  • GER 7×1 BRA (yeah, we’re not proud of it)
  • 0.5M simultaneous users @ a single game – ARG x SUI
  • 580Gbps @ a single game – ARG x SUI
  • =~ 1600 watched years @ the whole event

The core overview

The project was to receive an input stream, generate an HLS output stream for hundreds of thousands of viewers and provide a great experience for the end users:

  1. Fetch the RTMP input stream
  2. Generate HLS and send it to Cassandra
  3. Fetch binary and meta data from Cassandra and rebuild the HLS playlists with Nginx+lua
  4. Serve and cache the live content in a scalable way
  5. Design and implement the player

If you want to understand why we chose HLS, check this presentation (available only in pt-BR). tip: sometimes we need to rebuild some things from scratch.

The input

The live stream comes to our servers as RTMP and we were using EvoStream (now we’re moving to nginx-rtmp) to receive this input and to generate HLS output to a known folder. Then we have some Python daemons, running on the same machine, watching this known folder, parsing the m3u8 and posting the data to Cassandra.

To watch for file modifications and be notified of these events, we first tried watchdog, but for some reason we weren’t able to make it work as fast as we expected, so we switched to pyinotify.

Another challenge we had to overcome was making the Python program scale to x CPU cores; we ended up creating multiple Python processes and using async execution.
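To give an idea of what “parsing the m3u8” involves, here is a minimal sketch. The real daemons were written in Python; JavaScript is used here only to match the other examples on this page, and the function name and the shape of the returned objects are assumptions. It only handles the #EXTINF tag of a media playlist and ignores everything else.

```javascript
// Minimal media-playlist parser: pairs each "#EXTINF:<duration>,<title>"
// line with the segment URI that follows it.
const parseMediaPlaylist = (text) => {
  const segments = []
  let pendingDuration = null
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim()
    if (line.startsWith('#EXTINF:')) {
      // parseFloat stops at the comma that follows the duration
      pendingDuration = parseFloat(line.slice('#EXTINF:'.length))
    } else if (line !== '' && !line.startsWith('#') && pendingDuration !== null) {
      segments.push({ uri: line, duration: pendingDuration })
      pendingDuration = null
    }
  }
  return segments
}

const samplePlaylist = [
  '#EXTM3U',
  '#EXT-X-TARGETDURATION:10',
  '#EXT-X-MEDIA-SEQUENCE:42',
  '#EXTINF:9.009,',
  'segment42.ts',
  '#EXTINF:10.0,',
  'segment43.ts',
].join('\n')

console.log(parseMediaPlaylist(samplePlaylist))
```

Each parsed segment (plus the binary chunk it points to) is the kind of record that would then be posted to the storage layer.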

tip: maybe the best language / tool is in another castle.

The storage

We were previously using Redis to store the live stream data, but we thought Cassandra was needed to offer DVR functionality easily (although we still use Redis a lot). Cassandra’s response time kept increasing with load, up to the point where clients started to time out and video playback stopped completely.

We were using it like a queue, which turns out to be an anti-pattern. We then denormalized our data, changed to LeveledCompactionStrategy and set durable_writes to false, since we could treat our live stream as ephemeral data.

Finally, but most importantly, since we knew the maximum size a playlist could have, we could specify the start column (filtering with id > minTimeuuid(now - playlist_duration)). This really mitigated the effect of tombstones for reads. After these changes, we were able to achieve a latency on the order of 10ms at the 99th percentile.
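A rough sketch of that bounded read is shown below. It only builds a CQL statement and its parameters; the table name live_chunks and the columns stream, id and chunk are hypothetical, and only the id > minTimeuuid(...) filter comes from the description above.

```javascript
// Builds a time-bounded read: only rows newer than (now - playlistDurationMs)
// are requested, so older (possibly tombstoned) columns are never scanned.
// Table and column names are made up for this sketch.
const playlistWindowQuery = (streamName, playlistDurationMs, now = Date.now()) => ({
  query: 'SELECT id, chunk FROM live_chunks WHERE stream = ? AND id > minTimeuuid(?)',
  params: [streamName, new Date(now - playlistDurationMs)],
})

// Example: a 60-second playlist window for a hypothetical stream name.
console.log(playlistWindowQuery('arg_x_sui', 60 * 1000))
```

The important part is that the lower bound is always present, so the amount of data read stays proportional to the playlist size rather than to the whole history of the stream.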

tip: limit your queries + denormalize your data + send instrumentation data to graphite + use SSD.

The output

With all the data and meta-data we could build the HLS manifest and serve the video chunks. The only thing we were struggling with was that we didn’t want to add an extra server to fetch and build the manifests.

Since we already had invested a lot of effort into Nginx+Lua, we thought it could be possible to use Lua to fetch and build the manifest. It was a matter of building a Lua driver for Cassandra and using it. One good thing about this approach (rebuilding the manifest) was that in the end we realized that we were almost ready to serve DASH.
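Rebuilding a manifest from stored metadata is mostly string assembly. The sketch below shows the idea under a few assumptions: the production code ran as Lua inside Nginx, while this is JavaScript to match the other examples here; the function name and the segment shape ({ uri, duration }) are made up; and only a handful of basic HLS tags are emitted.

```javascript
// Rebuilds a live HLS media playlist from segment metadata.
// No #EXT-X-ENDLIST is written, because a live playlist keeps growing.
const buildMediaPlaylist = (segments, firstSequence = 0) => {
  if (segments.length === 0) {
    throw new Error('cannot build a playlist without segments')
  }
  // Target duration: the longest segment, rounded up to whole seconds.
  const targetDuration = Math.ceil(Math.max(...segments.map((s) => s.duration)))
  const lines = [
    '#EXTM3U',
    '#EXT-X-VERSION:3',
    `#EXT-X-TARGETDURATION:${targetDuration}`,
    `#EXT-X-MEDIA-SEQUENCE:${firstSequence}`,
  ]
  for (const { uri, duration } of segments) {
    lines.push(`#EXTINF:${duration.toFixed(3)},`, uri)
  }
  return lines.join('\n') + '\n'
}

const manifest = buildMediaPlaylist(
  [
    { uri: 'segment42.ts', duration: 9.009 },
    { uri: 'segment43.ts', duration: 10 },
  ],
  42
)
console.log(manifest)
```

Because the manifest is generated from data rather than read from disk, emitting a different format from the same metadata becomes a matter of writing another builder.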

tip: test your lua scripts + check the lua global vars + double check your caching config

The player

In order to provide a better experience, we chose to build Clappr, an extensible open-source HTML5 video player. With Clappr – and a few custom extensions like PiP (Picture In Picture) and Multi-angle replays – we were able to deliver a great experience to our users.

tip: open source it from day 0 + follow the flow issue -> commit FIX#123

The sauron

To keep an eye on all these systems, we built a monitoring dashboard using mostly open source projects like logstash, elasticsearch, graphite, grafana, kibana, seyren, angular, mongo, redis, rails and many others.

tip: use SSD for graphite and elasticsearch

The bonus round

Although we didn’t open source the entire solution, you can check out most of it:


Discussion / QA @ HN

RSpec and Watir to test web applications

Testing is cool

Software testing

Software testing is an investigation conducted to provide stakeholders with information about the quality of the product or service under test. Software testing can also provide an objective, independent view of the software to allow the business to appreciate and understand the risks of software implementation. Test techniques include, but are not limited to, the process of executing a program or application with the intent of finding software bugs (errors or other defects) – Wikipedia

The main intent of this post is to introduce you to UI tests with some Ruby toys. In fact, you could create an entirely new project in Ruby just to test your legacy web project. It’s cool: you can learn a new language while working to improve your legacy product. If you are totally new to Ruby, maybe a Ruby overview can help you (or it might confuse you more).

Installing ruby, watir and rspec

Instead of installing Ruby directly, we are going to install RVM (Ruby Version Manager) and then use it to install any Ruby we need. The steps described here were done on Ubuntu 11.04. In your terminal, do the magic to install RVM.

bash < <(curl -s https://rvm.beginrescueend.com/install/rvm)
echo '[[ -s "$HOME/.rvm/scripts/rvm" ]] && . "$HOME/.rvm/scripts/rvm" # Load RVM function' >> ~/.bash_profile
source ~/.bash_profile

From now on, your life with Ruby interpreter versions will be better. Let’s install Ruby 1.9.2 (terminal again).

rvm install 1.9.2

And what if we want to see the rubies installed on our machine?

rvm list

And now, how can we choose one Ruby to work with in the terminal session?

rvm use 1.9.2

For testing we will use Watir and RSpec, great tools that make BDD fun, and the best thing is that installing them is very easy.

gem install watir-webdriver
gem install rspec

Hands-on

Now that we have everything installed, we can move on to the example. The feature I want to test is Amazon’s search. To be more precise, I want to search for ‘Brazil’ and check that ‘Brazil on the Rise’ is within the results, and I also want to be sure that when I search for ‘semnocao’ Amazon doesn’t return any results. Now we can write the spec.

require 'amazon_page'

describe AmazonPage do
  before(:each) do
    @page = AmazonPage.new
  end
  after(:each) do
    @page.close
  end
  it "should show 'Brazil on the Rise' when I query for [Brazil]" do
    @page.query 'Brazil'
    @page.has_text('Brazil on the Rise').should == true
  end
  it "should bring no result when I search for [semnocao]" do
    @page.query 'semnocao'
    @page.results_count.should == 0
  end
end

The specification is very simple: it creates a page before each test and closes it after each test. There are only two tests: one for when you search for Brazil and one for when you search for semnocao. We will design the tests using the page object pattern. The class below is the page object representing the Amazon page, and all testable behaviors should live inside it.

require 'watir-webdriver'

class AmazonPage
  def initialize
    @page = Watir::Browser.new :firefox
    @page.goto 'http://www.amazon.com'
  end
  def close
    @page.quit
  end
  def query(parameter)
    @page.text_field(:id=>'twotabsearchtextbox').set parameter
    @page.send_keys :enter
  end
  def has_text(text)
    @page.text.include? text
  end
  def results_count
    if @page.text.include? 'did not match'
      0
    else
      @page.div(:id=>'resultCount').text.split(' ')[5].gsub(',','').to_i
    end
  end
end

To run this you just need to type the following in your terminal.

rspec spec/

The final code can be downloaded or viewed at github.

Additions

  • We could improve the readability of our stories with Cucumber.
  • We could send the browser execution to an Xvfb server (a.k.a. running headless). The browser popping up really bothers me.
  • We could integrate it with our CI.
  • We could design a base Page class to provide common operations, as a mixin or something similar.

PS: this post was heavily inspired by KK’s post and by Saush.