MRO (Microsoft R Open) has an upgraded BLAS and LAPACK via the Intel MKL. These are the math libraries that R uses for computations. Intel hired software engineers and computational mathematicians to develop the libraries to run parallel computations. The reference BLAS and LAPACK that ship with standard R are single-threaded and much slower by comparison.
If you want to have some fun and see how good MRO is compared to R, create a 10,000x10,000 matrix and multiply it by itself. Look at the system time for the calculation using MRO and R. MRO can do in seconds what R does in minutes to hours.
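Here is a minimal sketch of that experiment in base R. The helper name time_square and the default size are my own choices for illustration, not anything MRO provides. Start small: a 10,000x10,000 matrix of doubles needs about 800 MB on its own, and the product needs as much again.

```r
# Time squaring an n x n matrix of random numbers using whatever
# BLAS this R installation is linked against.
# `time_square` is a hypothetical helper written for this example.
time_square <- function(n = 1000) {
  m <- matrix(runif(n * n), nrow = n, ncol = n)
  timing <- system.time(p <- m %*% m)
  # Return the wall-clock seconds and the size of the product
  list(elapsed = unname(timing["elapsed"]), dims = dim(p))
}

res <- time_square(500)   # raise toward 10000 if you have the RAM and patience
print(res$elapsed)
```

Run the same script once in MRO and once in standard R and compare the elapsed values.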
If you use Microsoft R Server, you can run analyses in seconds that are simply impossible to run with standard R.
I ran this little script in RStudio using MRO just to show the difference (results included).
> require("expm")
> a=5000
> b=0.000000001
> c=matrix(0,a,a)
> for ( i in 1:a){c[i,i] = 1-1*b}
> for ( i in 2:a){c[i-1,i] = b}
> c[a,a]=1
> c[a,a-1]=0
> system.time(c%^%2)
user system elapsed
13.25 0.18 3.68
> setMKLthreads(1)
**** This is some code that only works with MRO. It tells the BLAS and LAPACK to run things on 1 thread (core). Normally, this will be 2 cores with an i3 or i5 processor and 4 cores with an i7 processor.
> system.time(c%^%2)
**** This is the result of using 1 core, just like standard R.
user system elapsed
10.70 0.13 10.58
When I ran the code in standard R, I got this:
> system.time(c%^%2)
user system elapsed
80.67 0.17 80.85
So, going by elapsed time, an MKL restricted to 1 core runs the code about 8 times faster than standard R (10.58 vs. 80.85 seconds). When it uses the 4 cores, it's about 22 times faster (3.68 vs. 80.85 seconds). That's how good the MKL BLAS and LAPACK are... or how bad the standard R BLAS and LAPACK are.
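The ratios come straight from the elapsed times in the output above. A small sketch of the arithmetic (the vector names are just labels I picked):

```r
# Elapsed times in seconds, copied from the three runs above
elapsed <- c(mro_4_cores = 3.68, mro_1_core = 10.58, standard_r = 80.85)

# How many times faster each run is than standard R
speedup <- elapsed[["standard_r"]] / elapsed
print(round(speedup, 1))
```

That works out to roughly 22x for MRO on 4 cores and 7.6x for MRO on 1 core.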
On one of my home computers, I run this bit of code in a larger program but with a=70,000 to 80,000 and perform matrix multiplication 60 times. It takes my home computers with 128GB of RAM and 10 to 16 cores about a day. I wouldn't even try doing this with standard R. If you set a=10,000 in standard R, you'll see how long that takes.
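If you want to rerun the test at different sizes, here is a sketch that builds the same matrix without the loops and works in both MRO and standard R. The helper build_matrix is hypothetical, written for this post. I'm assuming setMKLthreads comes from MRO's RevoUtilsMath package, so the call is guarded and simply skipped where that package isn't installed. Squaring with %*% gives the same result as %^% 2 from expm, so the sketch needs no extra packages.

```r
# Hypothetical helper: the a x a matrix from the transcript, built without loops.
build_matrix <- function(a, b = 1e-9) {
  stopifnot(a >= 2)
  m <- diag(1 - b, a)               # 1 - b on the main diagonal
  m[cbind(1:(a - 1), 2:a)] <- b     # b just above the diagonal
  m[a, a] <- 1                      # last diagonal entry is exactly 1
  m
}

m <- build_matrix(200)              # try 5000 or 10000 on your own machine

# Assumption: RevoUtilsMath is the MRO package that exports setMKLthreads().
# On standard R the package is absent and this block does nothing.
if (requireNamespace("RevoUtilsMath", quietly = TRUE)) {
  RevoUtilsMath::setMKLthreads(1)   # restrict MKL to one thread
}

print(system.time(sq <- m %*% m))
```

Every row of this matrix sums to 1, which makes a handy sanity check after changing a.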
Why does this matter? If you took a class in algorithm design, you learned about Big O notation. Squaring an NxN matrix the standard way takes on the order of N^3 computations. So, 10,000x10,000 takes about 10,000^3 (10^12) computations. If you have 100,000 tuples by 40 columns of data, multiplying that matrix by its transpose takes 100,000x40x100,000 (4.0x10^11) multiplications. Since matrix multiplication is the core of regression computation, using the best LAPACK and BLAS is a good idea. Sometimes, it's a necessity.
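The counts above are easy to check in R itself. This is only the back-of-the-envelope arithmetic, with variable names I made up for the example:

```r
# Rough operation count for squaring an N x N matrix the standard way
n <- 1e4
square_ops <- n^3                    # 10,000^3

# 100,000 rows by 40 columns, multiplied as (rows x cols) times (cols x rows)
rows <- 1e5
cols <- 40
product_ops <- rows * cols * rows    # 100,000 x 40 x 100,000

print(format(c(square_ops, product_ops), scientific = TRUE))
```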
Since MRO is R, just better, I won't even touch standard R (unless I need to run a benchmark to show how bad it is ;-). If you do analytics with large data sets, MRO is it.
Another nice thing about MRO: you can use it as a substitute for MATLAB and outperform MATLAB. Oddly, MATLAB uses the same math libraries as MRO.
Hope this helps.
------------------------------
Andrew Ekstrom
Statistician, Chemist, HPC Abuser;-)
------------------------------
Original Message:
Sent: 07-12-2017 12:55
From: Filiep Samyn
Subject: Elephant
I have noticed some comments about MRO and am especially puzzled when I read that MRO is far better than R. I looked on the MSFT site https://mran.microsoft.com/rro/ but could not really see what the advantage would be to use MRO versus R. MSFT is usually good at advertising their products but the extra explanation gives no indication that MRO is better than R.
Any experiences that you care to share about using MRO would be most helpful.
Filiep