Thursday 3 October 2013

How to speed up Matlab execution?

Simple ways to speed up the Matlab program 

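1. Run the 'mlint' command ('checkcode' in newer releases) on the program for a basic rule check. Its messages often point at speed problems, such as arrays that grow inside a loop.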

2. Pre-allocate memory for variables, provided enough physical memory (RAM) is available.
Ex : If an array grows inside a for-loop to, say, 100x350, initialize it before the loop as big_array = zeros(100,350);
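
A minimal sketch of the difference, using the 100x350 size from the example above. The names 'grown', 't_grow' and 't_prealloc' and the fill values are only for illustration, and the timings depend on the machine and the Matlab version.

```matlab
rows = 100;
cols = 350;

% Without pre-allocation: the array is enlarged on every iteration
tic;
grown = [];
for r = 1:rows
    grown(r, :) = r * ones(1, cols);
end
t_grow = toc;

% With pre-allocation: memory is reserved once, before the loop
tic;
big_array = zeros(rows, cols);
for r = 1:rows
    big_array(r, :) = r * ones(1, cols);
end
t_prealloc = toc;

fprintf('growing: %.4f s, pre-allocated: %.4f s\n', t_grow, t_prealloc);
```

Both loops build the same matrix; only the way the memory is obtained differs.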

3.  Use Matlab's built-in functions wherever possible. Many of them are compiled code and run much faster than the same logic written in Matlab.

4.  Use array/vector operations instead of for-loops. This greatly reduces the execution time.
Example :
a) To generate a sine-wave of frequency fm, sampled at fs, for a duration of T.

clc;
clear;

fm = 1e6;
fs  = 10e6;
T = 1;

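%Logic-1 (Faster)
tic;
t = 0 : (1/fs) : T-(1/fs);
s = sin(2*pi*fm*t);
logic1 = toc % elapsed time in seconds

%Logic-2 (slow)
tic
for i = 1:1:(fs*T)
   s(i) = sin(2*pi*fm*(i-1)/fs); % (i-1)/fs so that the first sample is at t = 0, as in Logic-1
end
logic2 = toc % elapsed time in seconds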

------Matlab output in 'seconds'----
logic1 =

    0.4162

logic2 =

    2.0926

b) Use the 'find' and 'reshape' commands to avoid loops. They are really useful.
>> help find

>> help reshape
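
As a small illustration of both commands (the vector x and the variable names are made up for this sketch), the loop below and the 'find' call return the same indices:

```matlab
x = [4 -2 7 -9 1 -3];

% Loop version: collect the indices of the negative elements
idx_loop = [];
for k = 1:length(x)
    if x(k) < 0
        idx_loop(end+1) = k;
    end
end

% 'find' version: same result without a loop
idx_find = find(x < 0);      % [2 4 6]

% 'reshape' rearranges the same elements, column by column, into a 2x3 matrix
m = reshape(x, 2, 3);        % [4 7 1; -2 -9 -3]
```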

5. Split the logic into sub-modules and write a function for each one. This helps save memory: when control returns from a function, the memory used by its local variables is released automatically. This efficient memory usage in turn reduces the execution time.

6.  Use 'parfor' for loops whose iterations are independent of each other.
a. This requires the Parallel Computing Toolbox.
b. An easy way to use 'parfor' is to call user-written functions that do not need any input from a previous index.
c. For this to work, first open a pool of workers:
>> matlabpool open

(In newer releases, 'parpool' replaces 'matlabpool'.) Once Matlab detects the number of cores in the PC, it executes the loop body in parallel, distributing the indices across the workers.
d. Run your program with 'parfor'.
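
A minimal sketch of such a loop, assuming the Parallel Computing Toolbox is installed and a pool has been opened as in step c. The sum of squares is only a stand-in for a call to a user-written function.

```matlab
N = 8;
result = zeros(1, N);

% Each iteration uses only its own index k, so the iterations
% can run in any order on different workers.
parfor k = 1:N
    result(k) = sum((1:k).^2);   % stand-in for a user-written function of k
end

disp(result);
```

Because result(k) is written only by iteration k, Matlab can split the array across the workers and merge the pieces afterwards.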

7. Where for-loops can't be avoided, write those functions in C and interface them with Matlab through MEX. They are called in the same way as normal Matlab functions; the only difference is that they are pre-compiled with the 'mex' command, so Matlab just runs the compiled binary.

8. The 'plot' command is slow when it has to expand a scalar to match an array.
Example : t = 1:1:1000;
a = 10.23; %Slower for the floating point numbers
plot(t,a) ; % scalar 'a' against vector 't': this takes more time than the one below.

a_arr = a * ones(1,length(t));
plot(t,a_arr);

9. Use 'single' precision instead of 'double' precision numbers where the accuracy allows --- It halves the memory per number and can speed up array operations by up to about two times.
a. Matlab's default type is 'double' which takes 8 bytes of memory for every number that is created.
b. In case the program doesn't require the 64-bit precision, it can be changed to 'single' precision as given below:
t = 1:1:100; %(it is in double precision)
whos %this command lists all the variables in the workspace with their size & type information

y = single(t); %converting from double to single
whos % check the size of the y

10. Use array operations
a. Generating 40 complex sinusoids each with 20000 sample points
fIf = 1e6;
noOfFreqBins = 40;
freqSearchStepSize=100;
fSamp = 20e6;
codePeriod = 1e-3;
t  = single(0:1/fSamp:codePeriod-(1/fSamp));

carFreq1 = single((fIf - (noOfFreqBins/2-(1:1:noOfFreqBins))*freqSearchStepSize));
carFreq =  (repmat(carFreq1,length(t),1))'; %repeating the matrix
tt = repmat(t,length(carFreq1),1);
locSignal.carIq = single(exp(-1i*2*pi*carFreq(1:noOfFreqBins,:).*tt));

>> size(locSignal.carIq)
40  20000

b. A for-loop version of the same computation takes more time. The same idea can be used with other functions such as FFT, IFFT, etc., and it runs much faster. However, this method requires more memory, as it has to hold the entire arrays of many variables at once.

c. One thing to note here is that this approach uses all the available CPU cores even without the 'matlabpool open' command.

This indicates that Matlab executes many array operations in parallel on its own (built-in multithreading).

In a particular program with both 'array operations' and the 'parfor' command, the following observations were made (in both cases, 'single' precision was used):
  1. only 'array operations' and a normal 'for' loop -- 20 seconds
  2. same 'array operations' with a 'parfor' loop -- 40 seconds -- it looks like the time spent sharing the large arrays between the workers outweighs the gain here. 'parfor' performs better than a normal 'for' loop when the arrays are smaller.
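
As a sketch of the array approach in 10a, the two 'repmat' calls can also be replaced by the product of a frequency column and a time row, which gives the same 40 x 20000 phase matrix. The loop version is included only to check that both produce the same values; the names 'carIqArray' and 'carIqLoop' are made up here, and no timing claim is attached to this sketch.

```matlab
fIf = 1e6;
noOfFreqBins = 40;
freqSearchStepSize = 100;
fSamp = 20e6;
codePeriod = 1e-3;
t = single(0:1/fSamp:codePeriod-(1/fSamp));
carFreq1 = single(fIf - (noOfFreqBins/2-(1:1:noOfFreqBins))*freqSearchStepSize);

% Array version: (40x1 column) * (1xN row) gives the 40xN matrix of freq*time
carIqArray = exp(-1i*2*pi*(carFreq1.' * t));

% Loop version: one sinusoid (one row) per iteration
carIqLoop = zeros(noOfFreqBins, length(t), 'single');
for k = 1:noOfFreqBins
    carIqLoop(k, :) = exp(-1i*2*pi*(carFreq1(k)*t));
end

% Largest difference between the two results
maxDiff = max(abs(carIqArray(:) - carIqLoop(:)));
```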





