SubStr vs. MidStr

noob4life
Posts: 5
Joined: 17 Jul 2012, 08:55

SubStr vs. MidStr

Post by noob4life »

Hey there!

I've been following your forum for quite a while now and have usually found answers, but not this time.

I'm trying to do some networking stuff with several NXTs and therefore wanted to split a string containing flattened values.
For the splitting I used SubStr() and ran into problems with it, whereas MidStr() worked as desired. Here's an example:

Code: Select all

task main()
{
 // Just a random array so I can mix up something
 byte l_bTestArray[] = { 0x64, 0x4c, 0x4d, 0x01 };
 
 // Now in a string
 string l_sTestString = ByteArrayToStr(l_bTestArray);
 
 // Now a value I would like to get back in the end
 int l_iValue = 55;
 // Flatten the value
 string l_sFlattenedValue = FlattenVar(l_iValue);
 
 // Combine the two strings
 string l_sComplete = strcat(l_sTestString, l_sFlattenedValue);
 
 // Still works like a charm
 TextOut(0, LCD_LINE1, l_sComplete);
 
 // Now I want the header (l_bTestArray) only
 string l_sHeader = SubStr(l_sComplete, 0, 4);
 
 // Now I want the value (my 'payload') only  - therefore I do some math
 
 // l_iLength will be 2 which is correct
 int l_iLength = (StrLen(l_sComplete) - StrLen(l_sHeader));
 // but the payload string will be empty with SubStr()
 string l_sFlattenedPayload = SubStr(l_sComplete, StrLen(l_sHeader), l_iLength);
 int l_iUnflattenedValue = 0;
 UnflattenVar(l_sFlattenedPayload, l_iUnflattenedValue);

 NumOut(0, LCD_LINE4, l_iUnflattenedValue);
 NumOut(0, LCD_LINE5, StrLen(l_sHeader));
 NumOut(0, LCD_LINE6, StrLen(l_sFlattenedPayload));
 NumOut(0, LCD_LINE7, l_iLength);
 Wait(15000);
}
Result with SubStr():
dLM7


0
4
0
2
Result with MidStr():
dLM7


55
4
2
2
The guide says for SubStr():
string SubStr (string str, unsigned int idx, unsigned int len)
[inline]
Extract a portion of a string. Return a sub-string from the specified input string starting
at idx and including the specified number of characters. The input string parameter
may be a variable, constant, or expression.
and for MidStr():
string MidStr (string str, unsigned int idx, unsigned int len)
[inline]
Copy a portion from the middle of a string. Returns the substring of a specified length
that appears at a specified position in a string.
From these descriptions I can't tell the two functions apart, but since they do not behave the same way there must be some difference.

Can you help me?

Cheers,

noob4life
noob4life
Posts: 5
Joined: 17 Jul 2012, 08:55

Re: SubStr vs. MidStr

Post by noob4life »

No ideas? hmm

that's sad
h-g-t
Posts: 552
Joined: 07 Jan 2011, 08:59
Location: Albania

Re: SubStr vs. MidStr

Post by h-g-t »

Found this in a VB forum, possibly it applies to NXC as well -


"What is difference between String.Substring and MID function?

Said both performs the same type of operation on string. The basic difference lies, how they indice the string characters.

In case, if you have got string "TEST", the MID function will indice the characters as 1,2,3,4. But with String.substring they will be indiced as 0,1,2,3."
noob4life
Posts: 5
Joined: 17 Jul 2012, 08:55

Re: SubStr vs. MidStr

Post by noob4life »

h-g-t wrote:Found this in a VB forum, possibly it applies to NXC as well -


"What is difference between String.Substring and MID function?

Said both performs the same type of operation on string. The basic difference lies, how they indice the string characters.

In case, if you have got string "TEST", the MID function will indice the characters as 1,2,3,4. But with String.substring they will be indiced as 0,1,2,3."
Thanks for your answer!

I tried to adjust indices like this:

Code: Select all

 string l_sFlattenedPayload = SubStr(l_sComplete, StrLen(l_sHeader)-1, l_iLength);
 // should be equal to
 string l_sFlattenedPayload = MidStr(l_sComplete, StrLen(l_sHeader), l_iLength);
But it still gives me different outputs.

Thanks for your help, again!

Is there perhaps another solution?

Cheers,

noob4life
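
One way to test the indexing hypothesis in isolation is to call both functions with a plain string variable and constant arguments, so that no other function call appears inside the parameter list. The following is only a minimal sketch (the variable names are made up for illustration). If both functions index from zero, both lines should show "T".

```nxc
task main()
{
  // Hypothetical test string, taken from the quoted VB example
  string s = "TEST";

  // Same arguments for both functions: start index 0, length 1
  string a = SubStr(s, 0, 1);
  string b = MidStr(s, 0, 1);

  // Show both results so they can be compared on the display
  TextOut(0, LCD_LINE1, a);
  TextOut(0, LCD_LINE2, b);
  Wait(5000);
}
```

If the two lines match, the difference seen in the first post cannot come from the indexing convention.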
afanofosc
Site Admin
Posts: 1256
Joined: 26 Sep 2010, 19:36
Location: Nashville, TN

Re: SubStr vs. MidStr

Post by afanofosc »

SubStr and MidStr should result in identical behavior. MidStr is defined like this:

Code: Select all

inline string MidStr(string str, unsigned int idx, unsigned int len) {
  asm { strsubset __STRBUFFER__, str, idx, len  }
}
So all it does is emit the strsubset opcode and return the substring in __STRBUFFER__. The strsubset opcode uses zero-based indexes. The SubStr function is actually implemented in the compiler. It is coded like this in my Delphi code:

Code: Select all

  // SubStr(string, idx, len)
  OpenParen;
  // string
  StringExpression('');
  str := StrBufName;
  MatchString(TOK_COMMA);
  // idx
  BoolExpression;
  push;
  idx := tos;
  EmitLn(Format('mov %s, %s', [idx, RegisterName]));
  MatchString(TOK_COMMA);
  // len
  BoolExpression;
  CloseParen;
  EmitLn(Format('strsubset %s, %s, %s, %s', [StrRetValName, str, idx, RegisterName]));
  pop;
This NXC code:

Code: Select all

  l_sFlattenedPayload = SubStr(l_sComplete, StrLen(l_sHeader), l_iLength);
  l_sFlattenedPayload = MidStr(l_sComplete, StrLen(l_sHeader), l_iLength);
Results in this NBC code:

Code: Select all

	strcat __strbufmain, __main_7qG2_l_sComplete_7qG2_000
	strcat __strbufmain, __main_7qG2_l_sHeader_7qG2_000
	arrsize __D0main, __strbufmain
	sub __signed_stack_001main, __D0main, __constVal1
	mov __D0main, __main_7qG2_l_iLength_7qG2_000
	strsubset __strretvalmain, __strbufmain, __signed_stack_001main, __D0main
	strcat __strbufmain, __strretvalmain
	mov __main_7qG2_l_sFlattenedPayload_7qG2_000, __strbufmain

	strcat __strbufmain, __main_7qG2_l_sComplete_7qG2_000
	mov __MidStr_7qG2_str_7qG2_000_inline_main, __strbufmain
	strcat __strbufmain, __main_7qG2_l_sHeader_7qG2_000
	arrsize __D0main, __strbufmain
	sub __MidStr_7qG2_idx_7qG2_000_inline_main, __D0main, __constVal1
	mov __MidStr_7qG2_len_7qG2_000_inline_main, __main_7qG2_l_iLength_7qG2_000
	strsubset __strbufMidStr_inline_main, __MidStr_7qG2_str_7qG2_000_inline_main, __MidStr_7qG2_idx_7qG2_000_inline_main, __MidStr_7qG2_len_7qG2_000_inline_main
	mov __strretvalmain, __strbufMidStr_inline_main
	strcat __strbufmain, __strretvalmain
	mov __main_7qG2_l_sFlattenedPayload_7qG2_000, __strbufmain
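As you can see from the NBC code above, the compiler's implementation of SubStr breaks when one of its arguments is itself a call to a function that uses the same temporary string buffer variable (__strbufmain), as StrLen does here. In the first block, l_sComplete is copied into __strbufmain and then immediately overwritten with l_sHeader by the StrLen call, so strsubset extracts from the header instead of the complete string. In the second block, MidStr copies the string into its own parameter variable before StrLen runs, so it is not affected.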

Sadly, this reveals that several string-related NXC functions that are implemented inside the compiler have the same flaw, namely SubStr, StrReplace, StrToNum, StrLen, StrIndex, and FormatNum.

Here, for example, is how you can break FormatNum:

Code: Select all

//  string tmp = FormatNum("testing %d", StrLen(l_sHeader));
	strcat __strbufmain, __constStr0009
	strcat __strbufmain, __main_7qG2_l_sHeader_7qG2_000
	arrsize __D0main, __strbufmain
	sub __D0main, __D0main, __constVal1
	fmtnum __strretvalmain, __strbufmain, __D0main
	strcat __strbufmain, __strretvalmain
And an example of how to break StrReplace:

Code: Select all

//  string tmp3 = StrReplace("testing", 0, StrReplace("foo", 0, "goo"));
	strcat __strbufmain, __constStr0011
	mov __strtmpbufmain, __strbufmain
	strcat __strbufmain, __constStr0012
	mov __strtmpbufmain, __strbufmain
	strcat __strbufmain, __constStr0013
	mov __strretvalmain, __strbufmain
	strtoarr __strbufmain, __strretvalmain
	replace __strretvalmain, __strtmpbufmain, __constVal0, __strbufmain
	strcat __strbufmain, __strretvalmain
	mov __strretvalmain, __strbufmain
	strtoarr __strbufmain, __strretvalmain
	replace __strretvalmain, __strtmpbufmain, __constVal0, __strbufmain
	strcat __strbufmain, __strretvalmain
I will work on fixes for these functions. The workaround for these problems is to call the nested function before you call the string function with the defect and store the result in a temporary variable. Like this:

Code: Select all

  int len = StrLen(l_sHeader);
  l_sFlattenedPayload = SubStr(l_sComplete, len, l_iLength);
  string tmp = FormatNum("testing %d", len);
The problem only occurs when you call a function that uses the temporary string buffer variable within the parameter list of another function that uses the same temporary string buffer variable. If you avoid this situation, you will not run into this problem with the above-named string functions.
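
Applied to the program from the first post, the workaround might look like the sketch below. It reuses the original variable names. The temporaries l_iHeaderLen, l_iTotalLen, l_sInner and l_sOuter are introduced here purely for illustration. The StrReplace lines reuse the string literals from the StrReplace example above.

```nxc
task main()
{
  // Build the same header + payload string as in the first post
  byte l_bTestArray[] = { 0x64, 0x4c, 0x4d, 0x01 };
  string l_sTestString = ByteArrayToStr(l_bTestArray);
  int l_iValue = 55;
  string l_sFlattenedValue = FlattenVar(l_iValue);
  string l_sComplete = strcat(l_sTestString, l_sFlattenedValue);

  string l_sHeader = SubStr(l_sComplete, 0, 4);

  // Evaluate every StrLen call on its own line and keep the results in ints,
  // so that no string function is called inside another one's parameter list
  int l_iHeaderLen = StrLen(l_sHeader);
  int l_iTotalLen = StrLen(l_sComplete);
  int l_iLength = l_iTotalLen - l_iHeaderLen;

  // SubStr now only receives plain variables as arguments
  string l_sFlattenedPayload = SubStr(l_sComplete, l_iHeaderLen, l_iLength);
  int l_iUnflattenedValue = 0;
  UnflattenVar(l_sFlattenedPayload, l_iUnflattenedValue);

  // Same idea for nested StrReplace calls: compute the inner result first
  string l_sInner = StrReplace("foo", 0, "goo");
  string l_sOuter = StrReplace("testing", 0, l_sInner);

  NumOut(0, LCD_LINE4, l_iUnflattenedValue);
  NumOut(0, LCD_LINE6, StrLen(l_sFlattenedPayload));
  Wait(15000);
}
```

Each affected function is called with variables or constants only, which is the situation described above as safe.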

John Hansen
Multi-platform LEGO MINDSTORMS programming
http://bricxcc.sourceforge.net/
afanofosc
Site Admin
Posts: 1256
Joined: 26 Sep 2010, 19:36
Location: Nashville, TN

Re: SubStr vs. MidStr

Post by afanofosc »

I uploaded a new test release to the test_releases folder on the BricxCC website. The string functions which were not generating correct code should work correctly now.

http://bricxcc.sourceforge.net/test_releases/

John Hansen
noob4life
Posts: 5
Joined: 17 Jul 2012, 08:55

Re: SubStr vs. MidStr

Post by noob4life »

Thanks for the information and for the help, John!

Cheers,

noob4life