MS Office Forum / Excel / New Users / February 2007
finding numbers within text
|
|
Thread rating:  |
jasondebolt@gmail.com - 17 Feb 2007 03:09 GMT I'm trying to extract a 15 digit number from a long text string within cells.
Many of the cells in my spreadsheet have lots of text, with sometimes 200-300 words per cell. Each of these cells has one15 digit number that I need to extract and separate from the text. This 15 digit number will then be pasted into a different cell.
Normally, I would do a search for a 6 digit number that begins with 1 within a cell like this:
"This is my text 123456 hello everyone" --> text in cell A1
=MID(A1,FIND("1",A1,1),6) --> formula in cell A2
123456 --> result in cell A2
However, I'm looking for a 15 digit number that doesn't start or end with the same numbers; it's a completely random number. There is nothing before or after the numbers to reference, such as "order #" or "# of days."I don't think there are any wildcard characters in excel exclusivley for numbers. Anyone know how to create a formula for this?
Thanks, Jason
Gord Dibben - 17 Feb 2007 03:14 GMT Jason
You could use this Function on a copy of the cell(s) with numbers/text.
Function DeleteNonNumerics(ByVal sStr As String) As Long Dim i As Long If sStr Like "*[0-9]*" Then For i = 1 To Len(sStr) If Mid(sStr, i, 1) Like "[0-9]" Then DeleteNonNumerics = DeleteNonNumerics & Mid(sStr, i, 1) End If Next i Else DeleteNonNumerics = sStr End If End Function
Usage is: =DeleteNonNumerics(cellref)
Gord Dibben MS Excel MVP
>I'm trying to extract a 15 digit number from a long text string within >cells. [quoted text clipped - 21 lines] >Thanks, >Jason jasondebolt@gmail.com - 17 Feb 2007 03:43 GMT Thanks for the quick reply, you really are an Excel MVP :)
However, it's still not working. The function works for some numbers, but not for the specific 15 digit long numbers that I'm looking for. Do you know why the function recognizes some numbers within text strings, but not all of them? Also, is there a way to limit the numbers that the function spits out to be exactly 15 digits long? The output number has to be exactly 15 digits long, no more or less.
Thanks! Jason
> Jason > [quoted text clipped - 27 lines] > >Normally, I would do a search for a 6 digit number that begins with 1 > >within a cell like this:
> >"This is my text 123456 hello everyone" --> text in cell A1 > [quoted text clipped - 10 lines] > >Thanks, > >Jason Bob Phillips - 17 Feb 2007 11:05 GMT If you are happy with the result being a string, you can use
=MID(A1,MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},A1&"0123456789")),SUMPRODUCT(LEN(A1)-LEN(SUBSTITUTE(A1,{0,1,2,3,4,5,6,7,8,9},""))))
 Signature --- HTH
Bob
(there's no email, no snail mail, but somewhere should be gmail in my addy)
> Thanks for the quick reply, you really are an Excel MVP :) > [quoted text clipped - 54 lines] >> >Thanks, >> >Jason Ron Rosenfeld - 17 Feb 2007 12:37 GMT >If you are happy with the result being a string, you can use > >=MID(A1,MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},A1&"0123456789")),SUMPRODUCT(LEN(A1)-LEN(SUBSTITUTE(A1,{0,1,2,3,4,5,6,7,8,9},"")))) I don't know if it makes a difference to the OP, but your formula does not differentiate between numbers that contain 15 digits, and those that contain fewer or more. --ron
Bob Phillips - 17 Feb 2007 14:14 GMT Hi Ron,
I wasn't sure from his post whether he only wanted 15 digit numbers amongst others, or they would all be 15 digits. From his post I took it that he wanted all numbers, he just couldn't use his formula because they didn't all start with 1.
 Signature --- HTH
Bob
(there's no email, no snail mail, but somewhere should be gmail in my addy)
>>If you are happy with the result being a string, you can use >> [quoted text clipped - 5 lines] > fewer or more. > --ron Ron Rosenfeld - 17 Feb 2007 18:35 GMT >Hi Ron, > >I wasn't sure from his post whether he only wanted 15 digit numbers amongst >others, or they would all be 15 digits. From his post I took it that he >wanted all numbers, he just couldn't use his formula because they didn't all >start with 1. Yes, it was a bit hard to tell.
I assumed that he just wanted a 15 digit number, and that there would be only one 15 digit number in the string. So my regular expression solution only returns numbers that are exactly 15 digits in length. And by definition, there would have to be a "non-number" before and after.
--ron
Ron Rosenfeld - 17 Feb 2007 04:32 GMT >I'm trying to extract a 15 digit number from a long text string within >cells. [quoted text clipped - 21 lines] >Thanks, >Jason You could use this UDF which makes use of Regular Expressions.
<alt-F11> opens the VB Editor.
Ensure your project is highlighted in the Project Explorer window, then Insert/Module and paste the code below into the window that opens.
You must also set a reference to Microsoft VBScript Regular Expressions 5.5 (Under Tools/References).
===================================================== Option Explicit
Function Get15(str As String) As String Dim objRegExp As RegExp Dim objMatch As Match Dim colMatches As MatchCollection Const Pattern As String = "(^|\D)(\d{15})(\D|$)"
' Create a regular expression object. Set objRegExp = New RegExp
'Set the pattern by using the Pattern property. objRegExp.Pattern = Pattern
'Set global applicability. objRegExp.Global = True
'Test whether the String can be compared. If (objRegExp.Test(str) = True) Then Set colMatches = objRegExp.Execute(str) Set objMatch = colMatches.Item(0) Get15 = objMatch.SubMatches(1) End If End Function ====================================================
"Pattern" sets the specifications. The Pattern that needs to be matched consists of:
1. Either a non-Digit character or the start of the string followed by 2. any 15 consecutive digits followed by 3. a non-Digit character or the end of the string
To use this, enter the UDF =Get15(str) where str is either a string or a cell reference containing the string. --ron
Miyahn - 17 Feb 2007 05:23 GMT > You must also set a reference to Microsoft VBScript Regular Expressions 5.5 Here is a late binding version. (This can use without reference setting.)
Option Explicit Function GetNumber(Src As String, Optional Digit As Long = 0) As String Application.Volatile With CreateObject("VBScript.RegExp") Select Case Digit Case 0: .Pattern = "\d+" Case Else: .Pattern = "(?:^|\D)(\d{" & CStr(Digit) & "})(?=\D|$)" End Select If .Test(Src) Then GetNumber = .Execute(Src)(0).Submatches(0) End If End With End Function
 Signature Miyahn Microsoft MVP for Microsoft Office - Excel(Jan 2004 - Dec 2007) https://mvp.support.microsoft.com/profile=e971f039-a892-426c-9544-83d372c269b4
Ron Rosenfeld - 17 Feb 2007 12:06 GMT >> You must also set a reference to Microsoft VBScript Regular Expressions 5.5 > [quoted text clipped - 14 lines] > End With >End Function If there are many fields to process, won't late binding make for a longer execution time? --ron
Miyahn - 18 Feb 2007 11:33 GMT > If there are many fields to process, won't late binding make for a longer > execution time? Yes. In my environments, the late binding version takes a little longer time (+1.2%) to execute.
Option Explicit Function GetNumberLB(Src As String, Optional Digit As Long = 0) As String Application.Volatile With CreateObject("VBScript.RegExp") Select Case Digit Case 0: .Pattern = "\d+" Case Else: .Pattern = "(?:^|\D)(\d{" & CStr(Digit) & "})(?=\D|$)" End Select If .Test(Src) Then GetNumberLB = .Execute(Src)(0).Submatches(0) End If End With End Function ' Function GetNumberEB(Src As String, Optional Digit As Long = 0) As String Application.Volatile With New RegExp Select Case Digit Case 0: .Pattern = "\d+" Case Else: .Pattern = "(?:^|\D)(\d{" & CStr(Digit) & "})(?=\D|$)" End Select If .Test(Src) Then GetNumberEB = .Execute(Src)(0).Submatches(0) End If End With End Function '
Sub PerformanceTest() Const DummySrc = "ABCD123456EFG" Dim S As Single, I As Long, DummyResult As String S = Timer For I = 1 To 10000 DummyResult = GetNumberLB(DummySrc, 6) Next I Debug.Print "Late Binding: " & Timer - S S = Timer For I = 1 To 10000 DummyResult = GetNumberEB(DummySrc, 6) Next I Debug.Print "Early Binding: " & Timer - S End Sub
 Signature Miyahn Microsoft MVP for Microsoft Office - Excel(Jan 2004 - Dec 2007) https://mvp.support.microsoft.com/profile=e971f039-a892-426c-9544-83d372c269b4
jasondebolt@gmail.com - 17 Feb 2007 23:14 GMT > > You must also set a reference to Microsoft VBScript Regular Expressions 5.5 > [quoted text clipped - 18 lines] > Miyahn > Microsoft MVP for Microsoft Office - Excel(Jan 2004 - Dec 2007)https://mvp.support.microsoft.com/profile=e971f039-a892-426c-9544-83d... It works! Thank you :)
jasondebolt@gmail.com - 17 Feb 2007 23:15 GMT > > You must also set a reference to Microsoft VBScript Regular Expressions 5.5 > [quoted text clipped - 18 lines] > Miyahn > Microsoft MVP for Microsoft Office - Excel(Jan 2004 - Dec 2007)https://mvp.support.microsoft.com/profile=e971f039-a892-426c-9544-83d... Thanks!!
T. Valko - 17 Feb 2007 04:41 GMT Do any numbers have leading 0's:
012345678901234
This formula will extract the number string but will drop any leading 0's:
=LOOKUP(99^99,--MID(A1,MIN(FIND({0,1,2,3,4,5,6,7,8,9},A1&"0123456789")),ROW(INDIRECT("1:"&LEN(A1)))))
Excel insists on formatting the result as SCIENTIFIC so you'll want to format as NUMBER 0 decimal places. Or, if you do have leading 0's you can use a custom format of: 000000000000000
Biff
> I'm trying to extract a 15 digit number from a long text string within > cells. [quoted text clipped - 21 lines] > Thanks, > Jason Bernd - 17 Feb 2007 16:15 GMT Hi Jason,
A suggestion: Function XtractNum(s As String) As Double Dim i As Long, d As Double, c As String, f As Double
i = 0: d = 0#: f = 1#
StateStart: i = i + 1 If i > Len(s) Then GoTo StateEnd c = Mid(s, i, 1) Select Case c Case "0" To "9" d = CDbl(c) GoTo StatePreComma Case "." d = 0# GoTo StatePostComma Case Else GoTo StateStart End Select
StatePreComma: i = i + 1 If i > Len(s) Then GoTo StateEnd c = Mid(s, i, 1) Select Case c Case "0" To "9" d = 10# * d + CDbl(c) GoTo StatePreComma Case "." GoTo StatePostComma Case Else GoTo StateEnd End Select
StatePostComma: i = i + 1 If i > Len(s) Then GoTo StateEnd c = Mid(s, i, 1) Select Case c Case "0" To "9" f = f / 10# d = d + CDbl(c) * f GoTo StatePostComma Case Else GoTo StateEnd End Select
StateEnd: XtractNum = d
'Small specialty: Accept only 15 digit numbers: If Len(CStr(d)) <> 15 Then XtractNum = CVErr(xlErrNum) End If
End Function
Please notice that leading zeros will be swallowed and that the decimal point will be counted as a digit, too. Change the "small specialty" at the end of my UDF if necessary.
Regards, Bernd
|
|
|