Learning Elixir Through Exercism

So, here’s the thing: I need to learn Elixir and the Phoenix Framework for my current job.

I find that Elixir provides a lot of macros for things the language doesn’t do out of the box, which makes the whole transition from the object-oriented paradigm much easier.

But when you’re learning something new, I find that the most important thing is to learn the basics really well and worry about productivity later; otherwise you end up developing bad habits, and those are hard to get rid of.

Well, that’s not an easy task. Even in Elixir’s documentation there are at least three ways of writing conditionals, and they’re probably just macros around case.

The exercism way

That’s when I found exercism. It has tons of exercises to try your skills on a new programming language, and you can compare your solutions with those of other people worldwide. You can install a CLI on your computer and fetch the exercises whenever you want. All the exercises come with tests, so you have to use a TDD approach, which is a plus.

The exercises also come with comments on how to do them. And one particular exercise’s comment caught my attention: list-ops.

# Please don’t use any external modules (especially List) in your implementation. The point of this exercise is to create these basic functions yourself. Note that `++` is a function from an external module (Kernel, which is automatically imported) and so shouldn’t be used either.

That’s what I wanted! I wanted to learn the language without worrying about libs!

So, basically I had to implement these functions from scratch:

  • count
  • reverse
  • map
  • filter
  • reduce
  • append
  • concat
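To give an idea of what that means, here’s the same exercise sketched in JavaScript rather than Elixir (my actual solutions were in Elixir, so treat this as a rough analogue): everything builds on a hand-rolled recursive reduce. I’m still cheating a bit with slice and concat; in Elixir you’d pattern-match on [head | tail] instead.

```javascript
// A hand-rolled recursive reduce: no built-in list helpers for the core logic.
var reduce = function(list, acc, fn) {
    if (list.length === 0) return acc;
    return reduce(list.slice(1), fn(acc, list[0]), fn);
};

// count and map built on top of reduce, just like the exercise suggests.
var count = function(list) {
    return reduce(list, 0, function(acc, _) { return acc + 1; });
};

var map = function(list, fn) {
    return reduce(list, [], function(acc, x) { return acc.concat([fn(x)]); });
};

console.log(count([1, 2, 3]));                              // 3
console.log(map([1, 2, 3], function(x) { return x * 2; })); // [ 2, 4, 6 ]
```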

And I did:

[Code screenshot in the original post: my Elixir implementation of these functions.]

I found out that Erlang’s BEAM (think of it as the JVM for Erlang) is pretty good at handling recursive calls: tail calls are optimized away, so they won’t cause a stack overflow.

[Image: tail recursion in Elixir. Functional programming at its best.]

And since Elixir compiles to the same BEAM bytecode, it gets the same benefits out of the box. Neat, huh?

That’s all, folks.

 

ps: I really need to know how to end an article better.

Dynamic Programming

Today I stumbled across a very interesting article by Gareth Rees, in which he suggests that Dynamic Programming, the name we use for a technique common in combinatorial algorithms, should be called the Tabular Method instead.

I read that and started wondering:

What the hell is dynamic programming anyway?!

So, I kept reading and found a very well written explanation in the same article, with great examples! There was only one small problem: the examples were written in Python. I’m not completely illiterate in Python, but I still get confused whenever decorators and things like that show up. So, as an exercise, I’ve “translated” some of the article to JavaScript; just enough to get the whole dynamic programming (Tabular Method?) idea.

Dynamic Programming / Tabular Method

The example that he uses in the article is that of the Fibonacci sequence:

The Fibonacci sequence: F(0) = 0, F(1) = 1, F(n) = F(n-1) + F(n-2) for n > 1.

He states that if we program this algorithm in a naïve way:

var f = function(n){
    if (n <= 1) return n;
    else return f(n-1) + f(n-2);
}

It is going to run very slowly:

var start, end, i;
var values = [1, 15, 25, 30, 35, 40];
var result = [];
for(i = 0; i < values.length; i++){
    start = new Date().getTime();
    f(values[i]);
    end = new Date().getTime();
    result.push((end - start)/1000);
}
console.log(result); //in seconds
>>  [0, 0, 0.031, 0.321, 3.579, 39.858]

If we trace the execution of a small input using console.log, we can see what’s happening:

var f = function(n){
    console.log('Call f(' + n + ')');
    if (n <= 1) return n;
    else return f(n-1) + f(n-2);
}
f(4);
>>
Call f(4)
Call f(3)
Call f(2)
Call f(1)
Call f(0)
Call f(1)
Call f(2)
Call f(1)
Call f(0)

Some of the sub-problems are being calculated more than once: f(2) and f(0) are calculated twice, while f(1) is calculated thrice! You can imagine what happens when n is big. So, if we store the already calculated results in a table, there is no need to calculate any result more than once! Here is a simple implementation:

var f = (function() {
    var table = { 0: 0, 1: 1 };
    var fib = function(n) {
        var value;
        if (n in table) {
            value = table[n];
        } else {
            value = fib(n - 1) + fib(n - 2);
            table[n] = value;
        }
        return value;
    }
    return fib;
})();

Later, I learned that this method, caching results on the way down the recursion, is also called memoization; the tabular method in Rees’s sense fills the table bottom-up instead. Anyway, now you can try f(1000) and it’s way better than the previous way we were doing things.
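For contrast, here is a bottom-up version in the “tabular” spirit: instead of recursing down and caching on the way back up, we fill the table from the base cases upwards, with no recursion at all. This is my own sketch, not code from the article.

```javascript
// Bottom-up "tabular" Fibonacci: fill table[0..n] from the base cases up.
var fibTable = function(n) {
    var table = [0, 1]; // base cases F(0) and F(1)
    for (var i = 2; i <= n; i++) {
        table[i] = table[i - 1] + table[i - 2];
    }
    return table[n];
};

console.log(fibTable(10)); // 55
```

Same table, same answers, but the order of filling is explicit, which is exactly the distinction the article is making.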

Twitter API using node/javascript

Today I’m going to talk about how to search Twitter using its search API with Node/JavaScript.

why?

Well, I tried messing with this for one reason: sentiment analysis. I’m creating a dictionary that has all the major adjectives in Portuguese, each followed by a number that represents the adjective’s popularity on Twitter. After I have that, I can start classifying all the adjectives as good/bad/neutral. It’s going to be a basic dictionary, but for that I need to filter all the adjectives first. The Portuguese language has about 8k adjectives (only counting the masculine form). Imagine if I had to classify all of those manually… it would take ages! That’s why I’m using this approach of keeping only the adjectives that are actually mentioned on Twitter: to keep the dictionary small.

Ok, lez do this!

Right now, our dictionary looks something like this:

à-toa
aacheniano
aalénio
aalênio
aarónico
aarônico
ab-reativo
ab-rogado
ab-rogador
...

It’s still just a list of adjectives, but we are going to change that. First, let’s install the JavaScript wrapper for Twitter:

npm install twitter

This is a great wrapper library, so if you want to learn more about it, I suggest you take a further look at its repository.

After that, we need to register our app on Twitter. We do this by going to https://apps.twitter.com/ and clicking the button to register a new app.

After you have a new app registered, you can access the following info:

consumer_key
consumer_secret
access_token_key
access_token_secret

This info is necessary to consume the Twitter API. If you need more info on this subject, make sure you check this page.

Cool! Now what?

Now we can start our app:

var Twitter = require('twitter');
 
var client = new Twitter({
    consumer_key: 'xxxxxxxx',
    consumer_secret: 'xxxxxxxx',
    access_token_key: 'xxx-xxxxx',
    access_token_secret: 'xxxxxxxx'
});

Make sure that you replace all the x’s with your app’s information.

Now that we have that configured, we make the call like this:

client.get('/search/tweets.json', 
          {q:'WhatYouLookinFor'},
          function(error, params, response) {});

So, this function has 3 arguments:

  • The first argument is the API call. You can find more about that here. Basically, that’s the REST API call as a string.
  • The second argument is an object containing the API parameters. That argument is optional, but in this case we need it.
  • The third argument is a callback function that gets called after the API call succeeds (or fails).

If you want to access the return object, just look at the params argument inside the callback function. That’s the JSON object that gets returned! You can also get the raw HTTP response object: that’s the third argument of the callback function.
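For example, here’s a tiny helper that pulls just the tweet texts out of that params object. The statuses array and the text field come from the search API’s JSON response; the helper itself is my own sketch, so I test it below with a mock object instead of a live call.

```javascript
// Pull the text of every tweet out of a /search/tweets.json response object.
var extractTexts = function(searchResponse) {
    return searchResponse.statuses.map(function(tweet) {
        return tweet.text;
    });
};

// Inside the real callback you would use it like this:
// client.get('/search/tweets.json', { q: 'word' },
//            function(error, params, response) {
//                console.log(extractTexts(params));
//            });

// Quick check with a mock response:
var mock = { statuses: [{ text: 'hello' }, { text: 'world' }] };
console.log(extractTexts(mock)); // → ['hello', 'world']
```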

One thing about Twitter’s search API is that it’s limited to about 180 calls in a 15-minute window, so beware of how many calls you make or you might get an error (the first argument of the callback function)! 15 minutes divided by 180 calls works out to one call every 5 seconds, which is exactly the rate our app uses.

So, basically, this is the final code. It takes every word in a file, counts how often it’s mentioned on Twitter (on a scale that goes from 0 to 15), and appends the result to a new file, one word every 5 seconds:

var Twitter = require('twitter');
var fs = require('fs');
var client = new Twitter({ consumer_key: 'xxxxxxxx', consumer_secret: 'xxxxxxxx', access_token_key: 'xxx-xxxxx', access_token_secret: 'xxxxxxxx' });

var find = function(i, array) {
    if (array.length > i) {
        client.get('/search/tweets.json', {
            q: array[i]
        }, function(error, params, response) {
            if (error) throw error;
            console.log(array[i] + ':', params.statuses.length);
            if (params.statuses.length !== 0) {
                fs.appendFile('new.txt', array[i] + ':' + params.statuses.length + '\n', function(err) {
                    i++;
                    console.log('Saved to file');
                    setTimeout(function() {
                        console.log('calling twitter');
                        find(i, array);
                    }, 5000);
                });
            } else {
                i++;
                setTimeout(function() {
                    console.log('calling twitter');
                    find(i, array);
                }, 5000);
            }
        });
    }
};

fs.readFile('dicionarioAdjetivo.txt', {
    encoding: 'utf-8'
}, function(err, data) {
    if (err) throw err;
    var i = 0;
    find(i, data.split('\n'));
});
end result

I had to keep this running for about 11 hours (because of the rate restrictions).

I realized that the ~8k adjectives in the original file were reduced by about 1.5k. That’s not much, and not at all what I expected, but now I can filter this file once more to keep only the adjectives that were mentioned at least 15 times on Twitter. To accomplish this, I created a small program:

var fs = require('fs');

var fifteen = function(i, arr) {
    if (i < arr.length) {
        var quant = parseInt(arr[i].slice(arr[i].indexOf(':') + 1), 10);
        if (quant === 15) {
            fs.appendFile('15.txt', arr[i].slice(0, arr[i].indexOf(':')) + '\n', function(err) {
                i++;
                fifteen(i, arr);
            });
        } else {
            i++;
            fifteen(i, arr);
        }
    }
};

fs.readFile('new.txt', {
    encoding: 'utf-8'
}, function(err, data) {
    if (err) throw err;
    var i = 0;
    fifteen(i, data.split('\n'));
});

and the end result is:

à-toa:15
abaetê:15
abafado:15
abafador:15
abaixado:15
abalado:15
...
considerations

Well, I have a couple of observations that I would like to share about the whole Twitter API JavaScript experience:

  1. About the seed dictionary: it’s not that easy to find a compilation of all the adjectives of a given language. Or maybe I just didn’t do my research very well. What I did was create a crawler to go through a webpage called something like “allportugueseadjectives.com.br” and gather all those adjectives into a file. For that I used a simplified HTTP request library called (yes, you guessed it) request, plus some regex and string manipulation to scrape the website.
  2. About the 0 to 15 mention scale: it happens that Twitter only returns up to 15 results at once for a query. If you want to see the other results, you have to make another query using some sort of API pagination. The intent of this experiment was to mess around a bit with the Twitter API using JavaScript, and I don’t believe the exact mention count was necessary. If a word is mentioned at least 15 times, it’s good enough to keep in the final dictionary, don’t you think so? I do.
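On point 2, the usual pagination trick (as I understand the search API; take this as a sketch, not gospel) is to request the next page with max_id set to one less than the smallest tweet id you’ve already seen. One caveat: real tweet ids are 64-bit, so in production you’d work with id_str and big-integer arithmetic; plain numbers are fine for illustrating the idea.

```javascript
// Compute the max_id for the next page: one less than the oldest id seen.
// (Tweet ids increase over time, so the smallest id is the oldest tweet.)
var nextMaxId = function(statuses) {
    var ids = statuses.map(function(tweet) { return tweet.id; });
    return Math.min.apply(null, ids) - 1;
};

// Hypothetical usage: keep calling until `statuses` comes back empty.
// client.get('/search/tweets.json',
//            { q: 'word', count: 100, max_id: nextMaxId(params.statuses) },
//            function(error, params, response) { /* next page here */ });

var mock = [{ id: 507 }, { id: 502 }, { id: 499 }];
console.log(nextMaxId(mock)); // 498
```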

When I finish classifying all these adjectives, I’ll share the result on GitHub, if anyone’s interested. Cya, guys!