[C] `strtol(string, NULL, 16);`. Or you could do it the other way...

PJH

Verbatim...

int gethex(char *string)
{
	int val=0;
	int go=1;
	while (go)
	{
		char c=toupper(*string++);
		if ((c>='0' && c<='9') || (c>='A' && c<='F'))
		{
			c=(c>='A') ? (c-'A'+10) : (c-'0');
			val=(val<<4)|c;
		}
		else
		{
			go=0;
		}
	}
	return val;
}

PJH

And just after that, should we use

strtol(string, NULL, 2);

or

int getmask(char *string)
{
	int val=0;
	int i=0;
	while (*string=='0' || *string=='1')
	{
		char c=*string++;
		if (i<32 && c=='1') val|=1<<i;
		i++;
	}
	return val;
}

PleegWat

#define hexchar(c)   ((((c) & 0x1F) + 9) % 25)

PJH

They don't need any encouragement at obfuscation...

anotherusername

That looks like perfectly reasonable code to convert a hex string to a number on an embedded device with no filesystem.

Wait, that's the wrong TDWTF meme. Or is it?

The real issue is that the name is anti-descriptive of what it does...

PJH

It's one of the more fitting.

While the stuff we do is frequently described as 'embedded,' we do have the full set of Standard C libraries available. (And a file system.)

There's no excuse for this code being present. Except, perhaps, ignorance on behalf of the experienced coder who produced it... ("What? There's a function to convert base-n strings to ints?")

dcon

Unittest:
gethex(NULL);

oops...

anotherusername

If that crashes and burns with a null pointer exception, then I'd say it works correctly.

dcon

@anotherusername said:

null pointer exception

OP said "C".

anotherusername

So it seg faults? Still "works correctly".

PleegWat

I think quite a bit of the standard C library isn't NULL-safe either.

NeighborhoodButcher

When I saw [C] I wondered how fast will I see a potential crash. Took me about 2s.

PJH

@dcon said:

Unittest:
gethex(NULL);

oops...

And how would that be different from

strtol(NULL, NULL, 16);

?

So - still functionally the same, albeit, the home-grown version's UB is virtually guaranteed to be to crash, since the implementation is knows, I'm fairly certain the Standard Library implementations don't waste instructions testing for invalid inputs when the inputs are well documented (i.e. caller's responsibility)...

[4.25.x:root@centos tmp]# cat strtol.c 
#include <stdlib.h>
#include <stdio.h>

int main(void){
        return printf("%d\n", strtol(0, 0, 16));
}


[4.25.x:root@centos tmp]# make strtol
cc     strtol.c   -o strtol
[4.25.x:root@centos tmp]# ./strtol
Segmentation fault (core dumped)
[4.25.x:root@centos tmp]#

So - slot-in replacements for each other.

The fact that both crash may be a subject for another thread, but that's not the point here...

RandomStranger

@PleegWat said:

```
#define hexchar(c) ((((c) & 0x1F) + 9) % 25)
So, the hexadecimal value of "WTF?" is 29951 ...

PJH

@RandomStranger said:

So, the hexadecimal value of "WTF?" is 29951 ...

That EBCDIC or ASCII?

PleegWat

The formula is for ASCII. No clue what it does on EBCDIC.

RaceProUK

@PJH said:

@RandomStranger said:
So, the hexadecimal value of "WTF?" is 29951 ...

That EBCDIC or ASCII?

Given the level of

in the original code, it wouldn't surprise me to find it was some weird hybrid of the two

PleegWat

A check on the EBCIDC table on wikipedia (https://en.wikipedia.org/wiki/EBCDIC#Codepage_layout) indicates ((c) & 0x1F) for [0-9A-Fa-f] returns the same values on ASCII and EBCDIC. So the formula will work for both. I would not be surprised if it was written with that intent.

PJH

@PleegWat said:

So the formula will work for both.

Which so totally doesn't answer my question since the letters W and T don't fall into those ranges... .

PleegWat

you're right, sorry. That value would indeed differ

RandomStranger

You forgot to ask "big endian or little endian?"

PJH

@RandomStranger said:

You forgot to ask "~~big endian~~MSB 0 or ~~little endian~~LSB 0?"

HTH, HAND...

dkf

@RandomStranger said:

big endian

Bug endian?

Deadfast

Incidentally, what's up with:

int go=1;
while (go)
{
// ...
		go=0;

Why the hell not just use a break?

dkf

@Deadfast said:

Why the hell not just use a break?

Some people are pathologically allergic to break and continue, thinking that they're a kind of goto and that it means that the compiler becomes unable to analyse the loop.

ProTip: the compiler has no problems at all. You could hew the loop entirely out of gotos and the compiler would be fine with that.

RandomStranger

10 little endians, written to a file...

PJH

@Deadfast said:

Why the hell not just use a break?

Because there wasn't enough obfuscation to begin with?

I dunno. "Why bother writing the function to begin with" would be a more pertinent question.

Deadfast

@PJH said:

Because there wasn't enough obfuscation to begin with?

I dunno. "Why bother writing the function to begin with" would be a more pertinent question.

I'm not defending the function's existence, the reason I brought this up is that I've seen this particular "pattern" a few times before...

PWolff

The variable go is totally superfluous.

Why not ERR_DOING_IT_RIGHT and use c for this purpose too? After all, any proper C string is ended by a zero.

int val=0;
char c=0;
do
{
  val=(val<<4)|c;
  c=toupper(*string++);
  c=((c>='0' && c<='9')||(c>='A' && c<='F'))?(c>='A')?(c-'A'+10):(c-'0'):0;
} while (c);

PJH

@PWolff said:

The variable go is totally superfluous.

Why not ERR_DOING_IT_RIGHT and use c for this purpose too? After all, any proper C string is ended by a zero.
int val=0;
char c=0;
do
{
  val=(val<<4)|c;
  c=toupper(*string++);
  c=((c>='0' && c<='9')||(c>='A' && c<='F'))?(c>='A')?(c-'A'+10):(c-'0'):0;
} while (c);
```</blockquote>

Because `c` is the integer value of the ascii representation - i.e. it terminates the conversion on the first occurrence of a literal `0` in the string.
pjh@pjh-thinkpad:/tmp$ cat ./gethex.c
#include <stdio.h>

int gethex(const char* string){
int val=0;
char c=0;
do
{
val=(val<<4)|c;
c=toupper(*string++);
c=((c>='0' && c<='9')||(c>='A' && c<='F'))?(c>='A')?(c-'A'+10):(c-'0'):0;
} while (c);
return val;
}

int main(void){
const char* strings[]={
"0",
"1",
"9",
"a",
"A",
"A0",
"0A",
"A0A",
0
};
int i=0;
while (strings[i]){
printf("%s -> %d (%d)\n", strings[i], gethex(strings[i]), strtol(strings[i], 0, 16));
++i;
}
return 0;
}
pjh@pjh-thinkpad:/tmp$ ./gethex
0 -> 0 (0)
1 -> 1 (1)
9 -> 9 (9)
a -> 10 (10)
A -> 10 (10)
A0 -> 10 (160)
0A -> 0 (10)
A0A -> 10 (2570)
pjh@pjh-thinkpad:/tmp$

dkf

This post is deleted!

PJH

Ò_o

PWolff

@PJH said:

Because c is the integer value of the ascii representation - i.e. it terminates the conversion on the first occurrence of a literal 0 in the string.

A literal '0' is 48, though.

I've tested it in Visual Studio's C++ - that bastard compiler converts '0' to false!

PJH

@PWolff said:

A literal '0' is 48, though.

No - a literal '0' gets stored in c as 0 because of this bit:

c=((c>='0' && c<='9')||(c>='A' && c<='F'))?(c>='A')?(c-'A'+10):(c-'0'):0;

PWolff

Have noticed that too, meanwhile.

PleegWat

So use -1 as default value instead and change the test to (c<0)?

Unthinking

They mean different things. The long code stores the first bit as 1, the second bit as 2, the third bit as 4, the fourth bit as 8.

The 'strtol' code stores the last bit as 1, the last-but-one bit as 2, the last-but-two bit as 4, the last-but-three bit as 8.

PJH

@PleegWat said:

So use -1 as default value instead and change the test to (c<0)?

I'd rather the code not be used in the first place...

PJH

More from the same program...

	if (result)
	{
		int len=strlen(result);
		char *p=malloc(len+1);
		if (p)
		{
			strcpy(p,result);
			result=p;
		}
		else
		{
			result=NULL;
		}
	}

Instead of

	if (result)
	{
		result = strdup(result);
	}

dkf

I can remember when most platforms didn't have strdup…

PJH

Well since it's POSIX rather than Standard..

In other news I'm modifying some of my own code while also looking at that other program. In the past hour:

[878:root@centos apps]# svn diff -r head | grep "^\-"  | wc -l
2018
[878:root@centos apps]# svn diff -r head | grep "^\+"  | wc -l
60
[878:root@centos apps]# cat <mumble>/*.[ch] <mumble2>/<mumble>.c include/<mumble>.h | wc -l
7102

I've deleted around 1950 lines of source from around 9000. (Obsolete functionality that far too much time was spent on, but never actually got used in the end.)

I'm sure there's still some cruft lying around somewhere though..

dkf

@PJH said:

Well since it's POSIX rather than Standard..

Also, using strcpy instead of memcpy inside the reimplementation is dumb, since you've already definitely got the number of bytes to copy and know that the two buffers are not overlapping…

PleegWat

svn diff | cut -c1 | sort | uniq -c

PJH

@PleegWat said:

svn diff | cut -c1 | sort | uniq -c

Few more changes since last report, but...

[878:root@centos <wibble>-878-4.24.0R-<wobble>]# svn diff | cut -c1 | sort | uniq -c
     20 =
    435  
   2329 -
     61 @
     74 +
     11 C
     20 I
     11 s
[878:root@centos <wibble>-878-4.24.0R-<wobble>]#

What were you looking for?

PleegWat

It's just a one-liner to give you the same basic information. 2329 deletions, 74 insertions in 61 chunks. Been ages since I last used SVN so I'm not sure which lines it has start with =.

JazzyJosh

My favorite part is how it ignores the first character.

Except butts I'm wrong.

PJH

@PleegWat said:

Been ages since I last used SVN so I'm not sure which lines it has start with =.

# svn help st
status (stat, st): Print the status of working copy files and directories.
usage: status [PATH...]

  With no args, print only locally modified items (no network access).
  With -q, print only summary information about locally modified items.
  With -u, add working revision and server out-of-date information.
  With -v, print full revision information on every item.

  The first seven columns in the output are each one character wide:
    First column: Says if item was added, deleted, or otherwise changed
      ' ' no modifications
      'A' Added
      'C' Conflicted
      'D' Deleted
      'I' Ignored
      'M' Modified
      'R' Replaced
      'X' an unversioned directory created by an externals definition
      '?' item is not under version control
      '!' item is missing (removed by non-svn command) or incomplete
      '~' versioned item obstructed by some item of a different kind

Looks like cruft in the output...

# svn diff | grep "^="
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================
===================================================================

Yup.

PleegWat

Nah, it'll be separating something. Possibly files?

PJH

# svn diff | grep "^=" -B3 -A3 | head -n5
Index: apps/include/<wibble>.h
===================================================================
--- apps/include/<wibble>.h       (revision 40006)
+++ apps/include/<wibble>.h       (working copy)
@@ -11,7 +11,7 @@