Thursday, June 25, 2015

how to shuffle sort mapreduce

Shuffling is the process by which intermediate data from the mappers is transferred to zero, one, or more reducers. Each reducer receives one or more keys and their associated values, depending on the number of reducers (for a balanced load). The data arriving at each reducer is then sorted by key, so that all values for a given key are grouped together before the reduce function runs.
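
To make the data flow concrete, here is a toy simulation in plain Ruby. It is only a sketch of the idea, not how Hadoop implements it: the reducer count, the sample words, and the `partition` helper are all made up for illustration.

```ruby
# Toy simulation of shuffle and sort (illustrative only, not Hadoop's code).
NUM_REDUCERS = 2 # hypothetical number of reducers

# Intermediate (key, value) pairs as emitted by two hypothetical mappers.
mapper_outputs = [
  [["apple", 1], ["cherry", 1], ["banana", 1]],
  [["banana", 1], ["apple", 1], ["apple", 1]]
]

# Partition: choose a reducer for a key. String#sum is used here only
# because it gives the same result on every run; it is an assumption of
# this sketch, not the hash function a real framework uses.
def partition(key, num_reducers)
  key.sum % num_reducers
end

# Shuffle: route every pair to the reducer chosen for its key.
reducer_inputs = Array.new(NUM_REDUCERS) { [] }
mapper_outputs.flatten(1).each do |key, value|
  reducer_inputs[partition(key, NUM_REDUCERS)] << [key, value]
end

# Sort and group: each reducer sorts its pairs by key, then collects
# the values that belong to each key.
grouped = reducer_inputs.map do |pairs|
  pairs.sort_by { |key, _| key }
       .group_by { |key, _| key }
       .transform_values { |kvs| kvs.map { |_, value| value } }
end

grouped.each_with_index do |groups, i|
  groups.each { |key, values| puts "reducer #{i}: #{key} => #{values.inspect}" }
end
```

Because the reducer is chosen from the key alone, every value for a given key ends up on the same reducer, which is what lets the reduce step see a complete list of values per key.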

http://stackoverflow.com/questions/22141631/what-is-the-purpose-of-shuffling-and-sorting-phase-in-the-reducer-in-map-reduce


how to Capybara


  • sudo apt-get update

sudo apt-get install xvfb
sudo apt-get install x11-xkb-utils
sudo apt-get install xfonts-100dpi xfonts-75dpi xfonts-scalable xfonts-cyrillic
sudo Xvfb :10 -ac

java -version
sudo apt-get install default-jre


Use the nohup command-line utility, which lets a command, process, or shell script keep running in the background after you log out of the shell.
The syntax is as follows:
nohup command-name &


Testing your Ruby code with Rspec
We’re going to look at how to use Rspec to do TDD in Ruby.
Rspec is one of the best frameworks for testing in Ruby.


Selenium is an automated web testing framework.
Selenium IDE is just a record-and-replay macro tool.
 It is very limited in functionality and does not scale well if you want to deploy on multiple servers.
 Selenium WebDriver is flexible and lets you run Selenium headless on servers with no display.

 If you want to build robust process automation that runs 24x7 and needs to be reliable, your only choice is to run Selenium on a server. But in order to run, Selenium needs to launch a browser. If the machine has no display, the browser will not launch. So to use Selenium there, you need to fake a display and let Selenium and the browser think they are running on a machine with a display.

 configuring and running selenium headless in Ubuntu using Mozilla Firefox as our primary browser


 JavaScript Testing with Selenium & Capybara-Webkit
 By default Capybara uses Rack::Test which is a headless browser emulator
 If you need to test JS as part of your integration suite, then you need to use another driver.

 You’ll need to set up xvfb in order to use either Selenium or Capybara-Webkit.
 Now Selenium and capybara-webkit will use xvfb when launching a browser.
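
 A browser started by Selenium finds its X display through the DISPLAY environment variable. The sketch below assumes Xvfb is already running on display :10 (as started earlier with sudo Xvfb :10 -ac) and sets the variable from Ruby before any browser is launched. The constant name is made up for this example; in a shell, export DISPLAY=:10 has the same effect.

```ruby
# Assumption: Xvfb is already running on display :10.
XVFB_DISPLAY = ":10"

# Point child processes (such as a browser) at the virtual display,
# unless a display is already configured in this environment.
ENV["DISPLAY"] ||= XVFB_DISPLAY

puts "Browsers launched from this process will use DISPLAY=#{ENV['DISPLAY']}"
```

 Put this near the top of the spec helper, before the driver starts, so the setting is in place when the browser process is spawned.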

 There are alternative x-servers and alternative ways to use the x-server from the specs (headless gem)
 It’s possible to use Chrome or another WebKit-based browser

 http://tutorials.jumpstartlab.com/topics/capybara/capybara_with_selenium_and_webkit.html

Behaviour Driven Development for Ruby

http://rspec.info/

Rspec

It’s pretty easy to install Rspec. Pop open that command line and run this:
gem install rspec
http://code.tutsplus.com/tutorials/ruby-for-newbies-testing-with-rspec--net-21297

RVM - Ruby enVironment Manager


  • Without RVM, it’s pretty difficult to have more than one version of Ruby on your computer.

RVM stands for Ruby enVironment Manager.

http://code.tutsplus.com/tutorials/why-you-should-use-rvm--net-19529


  •  Postgres uses information from the operating system to determine the language and encoding of databases

 sudo /usr/sbin/update-locale LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8

 PostgreSQL is the database of choice in the Ruby community
sudo apt-get install postgresql libpq-dev

 Creating the Database Instance & Adding a User

sudo mkdir -p /usr/local/pgsql/data
sudo chown postgres:postgres /usr/local/pgsql/data
sudo su postgres
/usr/lib/postgresql/9.3/bin/initdb -D /usr/local/pgsql/data
createuser vagrant

There are several options for managing Ruby versions, but we’ll use RVM with the standard "single user" method.
From your SSH session, first install the curl tool for fetching files, then use the script provided by the RVM team for easy setup:

sudo apt-get install curl
\curl -sSL https://get.rvm.io | bash
source ~/.rvm/scripts/rvm

RVM has a built-in command for installing all the various compilers and packages you’ll need to build Ruby and common libraries:
rvm requirements

Then install both Ruby 1.9.3 and 2.1
rvm install 1.9.3
rvm install 2.1

You can set either as your default Ruby. For 2.1:
rvm use 2.1 --default
Or, for 1.9.3:
rvm use 1.9.3 --default

verify it:
which ruby
ruby -v

Rails’ Asset Pipeline needs a JavaScript runtime. There are several options, but let’s install NodeJS:
sudo apt-get install nodejs

http://tutorials.jumpstartlab.com/topics/vagrant_setup.html

how to Gemfile


  • let’s create a project folder and throw this in a Gemfile:


source "https://rubygems.org"
gem "sinatra"
gem "shotgun"
gem "cucumber"
gem "capybara"
gem "rspec"

bundle install

Optionally, if you’re using RVM, you can install these gems for this project only by running rvm --rvmrc --create 1.9.2@cucumber_example
(run this before bundle install).

http://code.tutsplus.com/tutorials/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber--net-21446


  • you can just google for whatever functionality you’re looking for. Once you find the gem, install it like this

gem install GEM_NAME

 If you’d like to upgrade RubyGems itself, run the first command below, then check the version with the second:
 gem update --system
 gem -v

 sudo gem install maruku
 sudo gem install aws-s3

 There are two ways you can use gems.
 Some are stand-alone ruby programs that you’ll run (most often from the command line) to do something.
 Then, there are gems that you’ll only use from inside projects
  Ruby doesn’t load everything by default, so you can use require to load extra libraries you want to use
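
  As a small illustration of require, the sketch below loads the JSON library, which ships with Ruby but is not available until it is required. The sample data is made up for the example.

```ruby
# JSON is part of Ruby's standard library, but it is not loaded by default.
require "json"

data = { "gem" => "sinatra", "stable" => true }

text = JSON.generate(data) # serialize the Hash to a JSON string
parsed = JSON.parse(text)  # parse the string back into a Hash

puts text
puts parsed["gem"]
```

  Gems you install yourself are loaded the same way: once a gem is installed, require with the gem's library name makes it available to your script.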
 
  Once you build a project, you might want to share it, or use it on another computer. However, anyone else who runs it will need to have all the right gems installed
 
  Bundler is a gem itself; you can install it by running
  sudo gem install bundler
 
  in the root of your project, create a file named Gemfile. This will declare what gems you need for this project.
  The first line(s) of your Gemfile will tell Bundler where to get your gems. Gems live in online repositories.


http://code.tutsplus.com/tutorials/ruby-for-newbies-working-with-gems--net-18977

tools


  • Gherkin is the language that Cucumber understands. It is a Business Readable, Domain Specific Language that lets you describe software’s behaviour without detailing how that behaviour is implemented.

Gherkin serves two purposes — documentation and automated tests
https://github.com/cucumber/cucumber/wiki/Gherkin


  • Rspec is the testing framework that provides the expectations (assertions), Cucumber runs the plain-language feature files that describe behaviour, and Capybara is the automation library that drives the browser and locates page elements. Together, Capybara, Rspec, and Cucumber let a user write automated test scripts and run them.

http://code.tutsplus.com/tutorials/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber--net-21446

how to create standalone maven project from command prompt

Windows
Set the environment variable
JAVA_HOME to C:\Program Files\Java\jdk1.6.0_21

Windows
Append the string ;%JAVA_HOME%\bin to the end of the system variable, Path.


Windows
Set the environment variables using system properties.
M2_HOME=C:\Program Files\Apache Software Foundation\apache-maven-2.2.1
M2=%M2_HOME%\bin
MAVEN_OPTS=-Xms256m -Xmx512m

Windows
Append the string ;%M2% to the end of the system variable, Path.

C:\Users\jupiter>java -version
java version "1.8.0_31"
Java(TM) SE Runtime Environment (build 1.8.0_31-b13)
Java HotSpot(TM) Client VM (build 25.31-b07, mixed mode)

C:\Users\jupiter>mvn -version
Apache Maven 3.3.1 (cab6659f9874fa96462afef40fcf6bc033d58c1c; 2015-03-13T22:10:27+02:00)
Maven home: C:\Downloads\apache-maven
Java version: 1.8.0_31, vendor: Oracle Corporation
Java home: C:\Program Files (x86)\Java\jdk1.8.0_31\jre
Default locale: fr_FR, platform encoding: Cp1252
OS name: "windows 7", version: "6.1", arch: "x86", family: "dos"

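mvn archetype:generate
At the interactive prompts, the values entered were the archetype number, the groupId, and the artifactId:
306
com.mytutos.maven.examples
test11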