The implementation of strlen in glibc

2009-09-10 00:00:00  Source: WEB开发网


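The main idea behind glibc's strlen is to examine one long int at a time (4 bytes on a 32-bit platform, 8 bytes where long is 64 bits wide) instead of one byte at a time. This reduces the number of loop iterations and so improves overall efficiency.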

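It relies on two tricks. First, the address of the string passed in may not be aligned on a 4-byte (long int) boundary, so the function first walks the string one byte at a time until it reaches an aligned address. Only then does it start comparing a whole word at a time.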
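The alignment step can be isolated as a minimal sketch. The helper name scan_to_alignment and its found_nul out-parameter are hypothetical and are not part of glibc; the sketch also assumes that uintptr_t is available and that alignment can be read off the low bits of the address, which holds on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of glibc): scan byte by byte until the
   pointer is aligned to sizeof(unsigned long) or a '\0' is found first.
   Returns the number of bytes skipped; *found_nul tells why it stopped. */
static size_t scan_to_alignment(const char *str, int *found_nul)
{
    const char *p = str;

    assert(str != NULL && found_nul != NULL);

    *found_nul = 0;
    /* The low bits of the address are nonzero while p is misaligned. */
    while (((uintptr_t)p & (sizeof(unsigned long) - 1)) != 0) {
        if (*p == '\0') {
            *found_nul = 1;   /* the string ended before the boundary */
            break;
        }
        ++p;
    }
    return (size_t)(p - str);
}
```

If found_nul comes back set, the returned count is already the string length; otherwise the pointer str plus the count is safe to read a word at a time.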

The second trick is how to test efficiently whether any of the 4 bytes in a word is 0.
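The zero-byte test can be tried on its own with a small sketch for 32-bit words. The function names has_zero_byte32 and has_zero_byte32_selfcheck are hypothetical; the constants are the same lomagic and himagic values that appear in the glibc source. Subtracting 0x01 from each byte sets that byte's top bit only if the byte was 0x00 or at least 0x81 (or a borrow arrived from a zero byte below it); masking with ~x then discards bytes whose top bit was already set, so a nonzero result means some byte is zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns nonzero if and only if at least one of
   the four bytes of x is 0x00. */
static int has_zero_byte32(uint32_t x)
{
    const uint32_t lomagic = 0x01010101u;  /* 0x01 in every byte */
    const uint32_t himagic = 0x80808080u;  /* 0x80 in every byte */

    return ((x - lomagic) & ~x & himagic) != 0;
}

/* Small usage example with hand-checked values. */
static void has_zero_byte32_selfcheck(void)
{
    assert(has_zero_byte32(0x41420043u));   /* second-lowest byte is 0x00 */
    assert(!has_zero_byte32(0x41424344u));  /* "ABCD": no zero byte */
    assert(!has_zero_byte32(0x80808080u));  /* high bits set, but no zero byte */
}
```

The test only says that a zero byte exists somewhere in the word, not which one it is, which is why the byte-by-byte checks on cp[0] to cp[3] are still needed afterwards.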

Now for the source code, whose comments are fairly detailed. Most of it comes down to bit-manipulation tricks:

C code

size_t
strlen (str)
     const char *str;
{
  const char *char_ptr;
  const unsigned long int *longword_ptr;
  unsigned long int longword, himagic, lomagic;

  /* Handle the first few characters by reading one character at a time.
     Do this until CHAR_PTR is aligned on a longword boundary.  */
  for (char_ptr = str; ((unsigned long int) char_ptr
                        & (sizeof (longword) - 1)) != 0;
       ++char_ptr)
    if (*char_ptr == '\0')
      return char_ptr - str;

  /* All these elucidatory comments refer to 4-byte longwords,
     but the theory applies equally well to 8-byte longwords.  */

  longword_ptr = (unsigned long int *) char_ptr;

  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
     the "holes."  Note that there is a hole just to the left of
     each byte, with an extra at the end:

     bits:  01111110 11111110 11111110 11111111
     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD

     The 1-bits make sure that carries propagate to the next 0-bit.
     The 0-bits provide holes for carries to fall into.  */
  himagic = 0x80808080L;
  lomagic = 0x01010101L;
  if (sizeof (longword) > 4)
    {
      /* 64-bit version of the magic.  */
      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
      himagic = ((himagic << 16) << 16) | himagic;
      lomagic = ((lomagic << 16) << 16) | lomagic;
    }
  if (sizeof (longword) > 8)
    abort ();

  /* Instead of the traditional loop which tests each character,
     we will test a longword at a time.  The tricky part is testing
     if *any of the four* bytes in the longword in question are zero.  */
  for (;;)
    {
      longword = *longword_ptr++;

      if (((longword - lomagic) & ~longword & himagic) != 0)
        {
          /* Which of the bytes was the zero?  If none of them were, it was
             a misfire; continue the search.  */

          const char *cp = (const char *) (longword_ptr - 1);

          if (cp[0] == 0)
            return cp - str;
          if (cp[1] == 0)
            return cp - str + 1;
          if (cp[2] == 0)
            return cp - str + 2;
          if (cp[3] == 0)
            return cp - str + 3;
          if (sizeof (longword) > 4)
            {
              if (cp[4] == 0)
                return cp - str + 4;
              if (cp[5] == 0)
                return cp - str + 5;
              if (cp[6] == 0)
                return cp - str + 6;
              if (cp[7] == 0)
                return cp - str + 7;
            }
        }
    }
}
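To experiment with the algorithm outside of glibc, it can be restated with an ANSI prototype under a different name. The following is only a sketch, not the glibc code: the name wordwise_strlen is hypothetical, the magic constants are derived by division rather than by the two-step shift, the unrolled cp[0] to cp[7] checks are replaced by a loop, and the word is loaded with memcpy. Like the original, it may read a few bytes past the terminating '\0' within the same aligned word, which ISO C does not guarantee to be valid for arbitrary objects, so treat it as a study aid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a sketch of the same idea, assuming 8-bit bytes
   and an unsigned long without padding bits. */
static size_t wordwise_strlen(const char *str)
{
    const char *p = str;
    unsigned long word;
    const unsigned long lomagic = (unsigned long)-1 / 0xFF;  /* 0x0101...01 */
    const unsigned long himagic = lomagic << 7;              /* 0x8080...80 */

    assert(str != NULL);

    /* Prologue: byte by byte until p is aligned on a word boundary. */
    while (((uintptr_t)p & (sizeof word - 1)) != 0) {
        if (*p == '\0')
            return (size_t)(p - str);
        ++p;
    }

    /* Main loop: examine one aligned word per iteration. */
    for (;;) {
        memcpy(&word, p, sizeof word);  /* load the word at p */
        if (((word - lomagic) & ~word & himagic) != 0) {
            size_t i;
            /* Some byte in this word is zero; find the first one. */
            for (i = 0; i < sizeof word; ++i)
                if (p[i] == '\0')
                    return (size_t)(p + i - str);
        }
        p += sizeof word;
    }
}
```

Comparing its result with the library strlen on strings that start at different offsets within an aligned buffer exercises both the prologue and the word loop.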