1). Data Operation    

 

0).   Sample Data

SampleData.txt (6 rows by 5 columns). The last column is the class label.

1, 0.3, 5, 11, 0

2, 0.2, 3, 12, 1

3, 0.1, 3, 15, 0

4, 0.1, 2, 14, 1

5, 0.2, 3, 12, 1

6, 0.1, 6, 10, 0

 

1).   Data Operation

1.1).       Create Array

load('File Path')         % loads data from a file.

zeros(row, column)    % creates array of all zeros

ones(row, column)     % creates array of all ones

repmat(A, m, n) % creates a large matrix consisting of an m-by-n tiling of copies of A.

reshape(A, m, n)        % reshape array

rand(m, n)          % creates m-by-n matrix of uniformly distributed random elements (between 0 and 1)

inf(m,n)                        % creates an m-by-n matrix of infinities.

NaN(m,n)                    % creates an m-by-n matrix of NaNs(Not-a-Number).

[e.g.]         

data = load('c:/my data directory/sampledata.txt')

data =

    1.0000    0.3000    5.0000   11.0000         0

    2.0000    0.2000    3.0000   12.0000    1.0000

    3.0000    0.1000    3.0000   15.0000         0

    4.0000    0.1000    2.0000   14.0000    1.0000

    5.0000    0.2000    3.0000   12.0000    1.0000

    6.0000    0.1000    6.0000   10.0000         0

Notice: Do not leave any space before the file path. load('   c:/ my data directory/sampledata.txt') will cause an "Invalid argument" error.

Notice: zeros(n), ones(n), or rand(n) returns an n-by-n matrix. repmat(A, n) creates an n-by-n tiling of A.
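
[e.g.]

The other constructors listed above can be tried without a data file. The following is only a minimal sketch; the variable names are illustrative and do not come from the original examples.

```matlab
Z = zeros(2, 3);            % 2-by-3 matrix of zeros
O = ones(2, 3);             % 2-by-3 matrix of ones
A = [1 2; 3 4];
T = repmat(A, 2, 3);        % A tiled 2 times down and 3 times across: 4-by-6
R = reshape(1:6, 2, 3);     % elements are filled column by column: [1 3 5; 2 4 6]
U = rand(3, 2);             % 3-by-2 matrix of uniform random values between 0 and 1
I = Inf(2, 2);              % 2-by-2 matrix of infinities
N = NaN(2, 2);              % 2-by-2 matrix of NaNs
S = zeros(3);               % a single size argument gives a square 3-by-3 matrix
```

Note that reshape requires the number of elements to stay the same (here 6 = 2*3).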

1.2).       Basic Arithmetic & Logic Expression

[e.g.]         

data+1;     % increases every element by 1

data.^2;     % squares every element. Do not forget the '.' before '^'.

data>3       % returns a logical matrix (1 for elements larger than 3; 0 otherwise)

ans =

     0     0     1     1     0

     0     0     0     1     0

     0     0     0     1     0

     1     0     0     1     0

     1     0     0     1     0

     1     0     1     1     0
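
[e.g.]

Logical results can be combined and counted. The sketch below rebuilds the sample data inline so that it runs on its own; the names mask, numHits and scaled are illustrative assumptions.

```matlab
data = [1 0.3 5 11 0; 2 0.2 3 12 1; 3 0.1 3 15 0; ...
        4 0.1 2 14 1; 5 0.2 3 12 1; 6 0.1 6 10 0];

mask = data > 3 & data < 12;   % combine conditions with & (and), | (or), ~ (not)
numHits = sum(mask(:));        % count the elements that satisfy both conditions
scaled = data .* 2;            % element-wise multiplication, same size as data
```

Here data(:) stacks all columns into one column vector, so sum returns a single number instead of one sum per column.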

1.3).       Subset of data

data( [row set], [column set] ). The row set or column set can be ':' (all), '1:3' (1 to 3), '[1 3 4]' (1 and 3 and 4), '2:end' (2 to the last), or a logical vector.

 

[e.g.]         

sample = data([1 3], 1:end-1)

sample =

    1.0000    0.3000    5.0000   11.0000

    3.0000    0.1000    3.0000   15.0000

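Notice: data(2:end) is quite different from data(:,2:end). With a single index, MATLAB uses linear indexing, which numbers the elements column by column, so data(2:end) returns elements 2 through 30 as a row vector, while data(:,2:end) returns columns 2 through 5. Please check the results of data(2:end).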
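
[e.g.]

A small sketch of the difference between the two expressions, with the sample data rebuilt inline (the names a and b are illustrative):

```matlab
data = [1 0.3 5 11 0; 2 0.2 3 12 1; 3 0.1 3 15 0; ...
        4 0.1 2 14 1; 5 0.2 3 12 1; 6 0.1 6 10 0];

a = data(2:end);        % one index: linear, column-by-column order, 1-by-29 row vector
b = data(:, 2:end);     % two indices: all rows, columns 2 to 5, 6-by-4 matrix

size(a)
size(b)
```

The first five elements of a are the rest of column 1; the sixth element is data(1,2).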

 

[e.g.]

index = data( :, end ) > 0     % returns a logical vector (1 for positive elements; 0 otherwise)

sample = data( index, : )     % returns rows 2, 4 and 5 of data

index =

     0

     1

     0

     1

     1

     0

sample =

    2.0000    0.2000    3.0000   12.0000    1.0000

    4.0000    0.1000    2.0000   14.0000    1.0000

    5.0000    0.2000    3.0000   12.0000    1.0000

Notice: we can use other logical expressions such as '==' (equal), '~=' (not equal), '.^2>3', 'isnan()' and so on. Function names are case-sensitive, so write isnan, not isNaN.

Notice: for concision, we can also write 'sample = data(data(:,end)==1, :)'. This expression is likely to be used widely in our projects (especially supervised learning).
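
[e.g.]

As one possible use in supervised learning, the sketch below separates features from labels and computes a mean vector per class. The variable names are assumptions made for illustration.

```matlab
data = [1 0.3 5 11 0; 2 0.2 3 12 1; 3 0.1 3 15 0; ...
        4 0.1 2 14 1; 5 0.2 3 12 1; 6 0.1 6 10 0];

features = data(:, 1:end-1);          % every column except the last
labels   = data(:, end);              % the class label column

class0 = features(labels == 0, :);    % rows 1, 3 and 6
class1 = features(labels == 1, :);    % rows 2, 4 and 5

mean0 = mean(class0);                 % 1-by-4 mean vector of class 0
mean1 = mean(class1);                 % 1-by-4 mean vector of class 1
```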

1.4).       Find indices and values of nonzero elements.

find(X, n, 'first') or find(X, n, 'last')     % returns the indices of the first or last n nonzero elements of X

         [e.g.]

         data(find( data( :, 1 ) > 3, 2, 'last') )

         ans =

              5

              6      

         Notice: Please check MATLAB help for more syntax of “find()”.
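
         [e.g.]

Two further forms of find, sketched on the sample data rebuilt inline (the output names are illustrative):

```matlab
data = [1 0.3 5 11 0; 2 0.2 3 12 1; 3 0.1 3 15 0; ...
        4 0.1 2 14 1; 5 0.2 3 12 1; 6 0.1 6 10 0];

idx = find(data(:, 3) == 3);           % row numbers where column 3 equals 3
firstIdx = find(data(:, 4) > 11, 1);   % only the first row where column 4 exceeds 11
[r, c] = find(data > 11);              % row and column subscripts of every match
```

With two outputs, find returns subscripts instead of linear indices, which is usually easier to read for matrices.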

1.5).       Differences and approximate derivatives

diff(X). If X is a vector, then diff(X) returns a vector, one element shorter than X, of differences between adjacent elements: [X(2)-X(1) X(3)-X(2) ... X(n)-X(n-1)]

Notice: When implementing prototype methods, you may find this function useful for finding the most frequent elements in a vector.

[e.g.]

v = [1 1 2 3 4 2 2 1 5 0]

v = sort( v )                                                    % sort v in ascending order

% v =       0     1     1     1     2     2     2     3     4     5

difference = diff( [v v(end) + 1 ] )                % nonzero where a run of equal values ends

%difference = 1     0     0     1     0     0     1     1     1     1

count = diff( find( [1 difference] ) )              % returns the frequency of each distinct element

% count =     1     3     3     1     1     1

uniqueElements = v ( find( difference ) )           % the last element of each run

% uniqueElements = 0     1     2     3     4     5

index = count >= max( count )   

% index = 0     1     1     0     0     0

mostFrequentElement = uniqueElements ( index )

% mostFrequentElement =         1     2

Notice: There may be more than one most frequent element.
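
[e.g.]

The same result can also be obtained with built-in functions. This is an alternative sketch rather than the method above; it assumes a MATLAB version that accepts '~' for an unused output (otherwise replace it with a dummy variable name).

```matlab
v = [1 1 2 3 4 2 2 1 5 0];

[u, ~, idx] = unique(v);             % u: sorted distinct values; idx maps each v(k) to u
counts = accumarray(idx(:), 1)';     % number of occurrences of each value in u
modes = u(counts == max(counts));    % every value tied for most frequent
m = mode(v);                         % mode returns only the smallest of the tied values
```

unique and accumarray avoid the explicit sort and diff steps, while mode is the shortest option when a single value is enough.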

1.6).       Some other useful functions

Functions: sum, min, max, mean, median, std, var, sort

These functions are easy to use, but note that MATLAB is column-oriented: by default, each function operates on every column separately when the input is a matrix. To operate along another dimension, pass the 'dim' argument (for min and max it is the third argument, after an empty placeholder []).

For example, min(A): If A is a vector, min(A) returns the smallest element in A. If A is a matrix, min(A) treats the columns of A as vectors, returning a row vector containing the minimum element from each column. If A is a multidimensional array, min operates along the first nonsingleton dimension.

[e.g.]

min(data)

ans =

    1.0000    0.1000    2.0000   10.0000         0

min(data, [], 2)

ans =

         0

    0.2000

         0

    0.1000

    0.2000

         0
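
[e.g.]

The 'dim' argument works similarly for the other functions. A minimal sketch on the sample data rebuilt inline (the names are illustrative):

```matlab
data = [1 0.3 5 11 0; 2 0.2 3 12 1; 3 0.1 3 15 0; ...
        4 0.1 2 14 1; 5 0.2 3 12 1; 6 0.1 6 10 0];

colSum = sum(data);            % default: one sum per column, 1-by-5
rowSum = sum(data, 2);         % dim = 2: one sum per row, 6-by-1
rowMax = max(data, [], 2);     % max, like min, takes dim as the third argument
total  = sum(data(:));         % data(:) is a single column, so this sums everything
sortedRows = sort(data, 2);    % sorts each row in ascending order
```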

1.7).       struct

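struct creates a structure array. To store a cell array in a field of a single (1-by-1) struct, wrap it in an extra pair of braces as in the example below; otherwise struct creates one array element per cell.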

[e.g.]

s = struct('strings',{{'hello','yes'}},'lengths',[5 3]);

s.lengths = [5 3 8]; % update the 'lengths' field of the struct s.
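
Fields can be read, added and inspected as in the following sketch. The field name 'count' and the variables first, names, hasLengths and t are illustrative assumptions.

```matlab
s = struct('strings', {{'hello', 'yes'}}, 'lengths', [5 3]);

first = s.strings{1};                      % read one cell of a field: 'hello'
s.count = numel(s.strings);                % assigning to a new name adds a field
names = fieldnames(s);                     % cell array listing the field names
hasLengths = isfield(s, 'lengths');        % true if the field exists

t = struct('strings', {'hello', 'yes'});   % single braces: 1-by-2 struct array
```

The last line shows why the double braces matter: with single braces, each cell becomes the field value of a separate struct element.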
