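Preface
This post records some notes on string handling. The material is simple and is offered for reference. It focuses on truncating strings: if a Chinese character counts as two characters and an English letter or digit counts as one, how should a string be truncated? In particular, when a name is too long and has to be shown with a trailing ellipsis, how can the string be cut cleanly?
The simple approach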
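// Truncate val to at most max width units.
// Characters outside \x00-\xff (e.g. Chinese) count as 2, everything else as 1.
const getByteVal = (val, max) => {
  let returnValue = ''
  let byteValLen = 0
  for (let i = 0; i < val.length; i++) {
    if (/[^\x00-\xff]/.test(val[i])) byteValLen += 2
    else byteValLen += 1
    // Stop and append an ellipsis once the width budget is exceeded
    if (byteValLen > max) {
      returnValue = returnValue + '...'
      break
    }
    returnValue += val[i]
  }
  return returnValue
}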
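The method above truncates mixed Chinese and English strings, counting one Chinese character as two English letters or digits. But if the string contains emoji, such as 😊😂🤣, the truncated result can come out garbled. Each of these emoji is stored as two UTF-16 code units (a surrogate pair), so its length is 2. Walking the string one index at a time can cut the pair in half and leave a lone surrogate that renders as a broken character.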
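The following is a small sketch of why index-based iteration breaks emoji. It only illustrates the surrogate-pair behavior described above; `startsSurrogatePair` is a hypothetical helper name introduced here, not part of the original code.

```javascript
// An emoji outside the Basic Multilingual Plane occupies two UTF-16 code units.
const emoji = '😊' // U+1F60A
console.log(emoji.length)      // 2
console.log([...emoji].length) // 1 (string iteration walks code points)

// Indexing returns one code unit at a time: a high surrogate, then a low surrogate.
const high = emoji.charCodeAt(0)
const low = emoji.charCodeAt(1)
console.log(high.toString(16), low.toString(16)) // d83d de0a

// Hypothetical helper: true when the code unit at index i begins a surrogate pair.
const startsSurrogatePair = (str, i) => {
  const hs = str.charCodeAt(i)
  const ls = str.charCodeAt(i + 1) // NaN past the end, so the checks below fail
  return hs >= 0xd800 && hs <= 0xdbff && ls >= 0xdc00 && ls <= 0xdfff
}
console.log(startsSurrogatePair('a😊', 1)) // true
console.log(startsSurrogatePair('a😊', 0)) // false
```

A truncation loop that advances by one index can stop between the two halves, which is where the garbled output comes from.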
A more complete string truncation method
// Truncate substring to at most maxLen width units (default 10) without splitting emoji.
// ASCII counts as 1; Chinese, other non-ASCII characters and emoji count as 2.
const subStringEmoji = (substring, maxLen) => {
  maxLen = maxLen || 10
  if (!substring) return ''
  let str_cut = ''
  let str_length = 0
  for (let i = 0; i < substring.length; ) {
    const hs = substring.charCodeAt(i)
    // Next code unit, or 0 when the current one is the last in the string
    const ls = i + 1 < substring.length ? substring.charCodeAt(i + 1) : 0
    let a = ''
    if (hs <= 0x7f) {
      // ASCII letter, digit or symbol: width 1
      str_length += 1
      a = substring.charAt(i)
      i++
    } else if (0xd800 <= hs && hs <= 0xdbff && 0xdc00 <= ls && ls <= 0xdfff) {
      // Surrogate pair (most emoji): keep both code units together, width 2
      str_length += 2
      a = substring.substring(i, i + 2)
      i += 2
    } else if (ls === 0x20e3) {
      // Followed by the combining enclosing keycap U+20E3: keep the two together
      str_length += 2
      a = substring.substring(i, i + 2)
      i += 2
    } else {
      // Any other non-ASCII character (Chinese, BMP symbols such as U+2B50): width 2
      str_length += 2
      a = substring.charAt(i)
      i++
    }
    // Append the character, or stop with an ellipsis once the budget is exceeded
    if (str_length > maxLen) {
      str_cut += '...'
      break
    }
    str_cut += a
  }
  return str_cut
}
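This approach truncates strings that mix Chinese, English and emoji. It works at the Unicode level: it inspects the UTF-16 code unit values to decide how wide each character is and whether two code units belong together.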
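As an alternative sketch, not the method used above, the same width rule can be written with `for...of`, which iterates a string by code point and so never splits a surrogate pair. `truncateByWidth` is a hypothetical name, and the rule "ASCII is 1, everything else is 2" is an assumption carried over from this post.

```javascript
// Hypothetical helper: truncate by display width, iterating by code point.
const truncateByWidth = (str, maxWidth = 10) => {
  let width = 0
  let out = ''
  for (const ch of str) {
    // ch is a whole code point, so an emoji arrives as one two-unit string
    width += ch.codePointAt(0) <= 0x7f ? 1 : 2
    if (width > maxWidth) return out + '...'
    out += ch
  }
  return out
}

console.log(truncateByWidth('abc中文😊def', 8)) // abc中文...
console.log(truncateByWidth('😊😂🤣', 4))       // 😊😂...
console.log(truncateByWidth('hello', 10))       // hello
```

This sketch still treats each code point separately, so emoji built from several code points (skin-tone modifiers, joined family emoji) could be cut in the middle. Segmenting by grapheme cluster, for example with `Intl.Segmenter` where it is available, is one way to handle those cases.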
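Further reading
Emoji can also be detected with regular expressions. I covered this in an earlier post, "JavaScript RegExp: common regular expressions for phone numbers and email addresses (common regex)", which includes patterns for emoji, special characters and more.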
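For illustration only, here is one possible way to detect emoji with a regular expression. It is not the pattern from the post mentioned above. It assumes a runtime that supports Unicode property escapes, and `containsEmoji` and `stripEmoji` are hypothetical helper names.

```javascript
// \p{Extended_Pictographic} requires the `u` flag. It matches pictographic emoji
// but not plain digits, which \p{Emoji} would also match.
const EMOJI_RE = /\p{Extended_Pictographic}/u

// Hypothetical helper: does the string contain at least one emoji?
const containsEmoji = (str) => EMOJI_RE.test(str)

// Hypothetical helper: remove every pictographic code point from the string.
const stripEmoji = (str) => str.replace(/\p{Extended_Pictographic}/gu, '')

console.log(containsEmoji('hello 😊'))       // true
console.log(containsEmoji('hello 中文 123')) // false
console.log(stripEmoji('a😊b😂c'))           // abc
```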