Using the REGEXP Regular Expression Matching Function
The REGEXP function is a little-known but extremely powerful MATLAB function, which can be used for matching regular expressions. A regular expression is a string that describes or matches a set of strings, according to certain syntax rules. Regular expressions are used by many text editors and utilities to search and manipulate bodies of text based on certain patterns.
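As a small illustration of a pattern describing a set of strings, the sketch below uses a made-up sentence (not part of the original example) and the expression '\d+', which stands for "one or more digits". It assumes only the standard REGEXP calling forms: with no option the function returns the starting index of each match, and with the 'match' option it returns the matched text.

```matlab
% Hypothetical sample text, chosen only for illustration.
str = 'order 66 shipped in 12 boxes';

% '\d+' describes every string made of one or more digits.
% With no option, regexp returns the starting index of each match.
starts = regexp(str, '\d+');

% The 'match' option returns the matching substrings themselves,
% as a cell array, instead of their positions.
numbers = regexp(str, '\d+', 'match');

disp(starts)
disp(numbers)
```

Both calls use the same expression; only the requested output differs.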
This example shows how to find files with the .m extension in a particular directory. The standard help for the REGEXP function can be viewed by typing "help regexp" or "doc regexp" at the MATLAB command line.
% Get current directory contents
generalFunctions = dir;
% Get all the file names; fileNames is a cell array of file names.
fileNames = {generalFunctions.name}
% We now use the regexp function to find the file names with a .m
% extension. The expression '.\.m\>' is interpreted as follows:
%
% 1) The first '.' means 'match any single character'. We only want names
% with a .m extension, so we don't care what the rest of the file name is.
%
% 2) The '\.m' expression indicates a literal '.m'. The backslash is
% necessary to specify a period rather than the aforementioned 'match any
% character' specifier.
%
% 3) The '\>' expression means 'the ".m" string must be at the END of a
% word'. This is necessary so that we don't return, for example,
% files with an extension of .mat. (To require the end of the whole file
% name instead, the '$' anchor can be used.)
%
% REGEXP returns a cell array with one cell per file name. Each cell holds
% the starting index of the match (the character just before the '.m') if
% the name has a .m extension, or is empty otherwise.
regexpResult = regexp(fileNames,'.\.m\>')
% We want to convert regexpResult into a logical index so that we can find
% the .m files. We use the CELLFUN function with the 'isempty' option in
% order to convert the 'regexpResult' cell array into a logical index.
%
% 'logicalIndex' is a vector of logical values (ones and zeros) and is the
% same length as 'fileNames'. A value of '1' indicates where a file name with
% a .m extension exists in the fileNames cell array. A value of '0'
% indicates where a file does not have a .m extension.
logicalIndex = ~cellfun('isempty',regexpResult)
% We now index into the file names with the logical index in order to get
% all the .m files:
fileNames(logicalIndex)
% The last three lines of code can be combined into one line:
fileNames(~cellfun('isempty',regexp(fileNames,'.\.m\>')))
fileNames =
'.' '..' 'Mfile.m' 'html' 'notM.wat' 'regexpexamp.asv' 'regexpexamp.m'
regexpResult =
[] [] [5] [] [] [] [11]
logicalIndex =
0 0 1 0 0 0 1
ans =
'Mfile.m' 'regexpexamp.m'
ans =
'Mfile.m' 'regexpexamp.m'
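Because '\>' marks the end of a word rather than the end of the whole name, a name such as 'script.m.bak' would also pass the filter above. The following sketch, which uses a hypothetical list of file names rather than real directory contents, compares the original expression with a variant that assumes the '$' end-of-string anchor is wanted instead. It also uses the 'once' option and the function-handle form of CELLFUN; both are optional stylistic choices, not part of the original example.

```matlab
% Hypothetical file names; 'script.m.bak' and 'data.mat' are included
% only to show how the two expressions differ.
fileNames = {'.', '..', 'Mfile.m', 'notM.wat', 'data.mat', 'script.m.bak'};

% Original expression: '.m' must be followed by the end of a word.
% 'once' stops at the first match, so each cell holds a scalar or [].
wordEnd = ~cellfun(@isempty, regexp(fileNames, '.\.m\>', 'once'));

% Variant: '$' requires '.m' to be the last two characters of the name.
strEnd = ~cellfun(@isempty, regexp(fileNames, '.\.m$', 'once'));

% Index into the file names with each logical index.
mFilesWordEnd = fileNames(wordEnd);
mFilesStrEnd = fileNames(strEnd);

disp(mFilesWordEnd)
disp(mFilesStrEnd)
```

In this sketch only the '$' variant excludes 'script.m.bak'. Both expressions reject 'data.mat', and both still require at least one character before the '.m'.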