Need help with a Regex in PHP (Answered)



  • I'm trying to write a regex in PHP (I know, I know, two TRWTF's already and I haven't even started) and I think I'm close to what I need but can't quite get there.

    This is for my MUD. I'm trying to use a regex with capture groups to parse a use command. This will allow the player to use an item from their inventory, possibly on another player (and possibly with alternate terms instead of "use", for example "eat" if it's considered a food item).

    I need three major capture groups. The first matches the action type (no real difference between actions, just some common-sense validation to prevent things that would be generally nonsensical in real life such as "drink bread"), the second match should be the item name, and the third match, if it exists, is the target name, for example if I was attempting to use a trinket or a potion on another player.

    The regex is built at run-time and the last part is populated with the display names of all online players in the same room. So here's an example regex in that format:

    /^(use|eat|drink|break|rub) (.)+ (on (bob|alice|smith|john))?$/
    

    And some example commands that should work:

    use teleport trinket
    use teleport trinket on bob
    drink healing potion
    use healing potion on bob
    break stone tablet
    eat bread

    My problem is that middle capture group in my regex breaks everything. I basically want it to capture EVERYTHING that is not captured by the first or last capture group. How do I do that? Something crazy with backreferences?



  • The Langoliers have you by the throat I suspect. (Or, regex engines are greedy, so there's basically no way to stop capturing without an end delimiter.)



    Doesn't the pumping lemma make regular expressions non-functional for a problem like this?

    Forgive me, my language-definition knowledge is SUPER rusty, but I know that regular expressions are the simplest, lowest level of grammar, and I'm pretty sure what you are trying to do goes past it?



  • @mott555 said:

    My problem is that middle capture group in my regex breaks everything. I basically want it to capture EVERYTHING that is not captured by the first or last capture group. How do I do that? Something crazy with backreferences?

    Sounds like you need a non-greedy quantifier. Stick a ? after the + and it should work.

    Also, I’m not completely sure if PHP supports it, but you can put ?: after the opening parenthesis of a group to avoid capturing it (so replacing (on with (?:on will avoid getting a useless "on bob" group in the results)



  • Just a simple tweak, maybe try:

    ^(use|eat|drink|break|rub) (.+?)( on (bob|alice|smith|john))?$

    The question mark after the plus should make it not greedy (not sure if PHP's implementation supports that, but it should). You also need to capture the whitespace before "on", otherwise it breaks matching when you aren't specifying that.

    Also, I'd recommend Debuggex and RegexPal for quickly testing regexes.
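
For anyone who wants to try the pattern above directly in PHP, here is a minimal sketch using `preg_match`. The command strings come from the examples earlier in the thread; the constant `USE_PATTERN` and the helper `split_command` are names made up for illustration, not anything from the original poster's code.

```php
<?php
// Pattern from the post above: lazy item name, optional " on <target>" suffix.
const USE_PATTERN = '/^(use|eat|drink|break|rub) (.+?)( on (bob|alice|smith|john))?$/';

// Hypothetical helper: returns [action, item, target], or null when nothing matches.
function split_command(string $command): ?array
{
    if (preg_match(USE_PATTERN, $command, $m) !== 1) {
        return null;
    }
    // PHP drops trailing groups that did not take part in the match,
    // so $m[4] is simply unset when there is no " on ..." suffix.
    return [$m[1], $m[2], $m[4] ?? null];
}

foreach (['use teleport trinket', 'use healing potion on bob', 'dance wildly'] as $command) {
    echo $command, ' => ', json_encode(split_command($command)), "\n";
}
```

Because `.+?` is lazy, the engine tries the shortest item name first and only extends it when the rest of the pattern cannot reach `$`, which is what lets the optional " on ..." suffix claim the target name.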



  • @ChaosTheEternal said:

    (not sure if PHP's implementation supports that, but it should)

    Last I checked, PHP uses the preg functions (ereg has been deprecated for like 10 years now), which are based on Perl's regexes... so it should.



  • @VinDuv said:

    Also, I’m not completely sure if PHP supports it, but you can put ?: after the opening parenthesis of a group to avoid capturing it (so replacing (on with (?:on will avoid getting a useless "on bob" group in the results)

    @ChaosTheEternal said:

    Just a simple tweak, maybe try:

    ^(use|eat|drink|break|rub) (.+?)( on (bob|alice|smith|john))?$

    The question mark after the plus should make it not greedy (not sure if PHP's implementation supports that, but it should). You also need to capture the whitespace before "on", otherwise it breaks matching when you aren't specifying that.

    Looks like both of these fixed it. Thanks guys! And I didn't even have to go over to StackOverflow, misrepresent everything because I don't have enough rep to ask about regexes, and get told to use JQuery!

    For posterity, here's my current example regex:

    /^(use|eat|drink|break|rub) (.+?)(?: on (bob|alice|jim|john))?$/
    

    And at https://www.functions-online.com/preg_match.html

    Doesn't deal with accidental extra whitespace, but I may do a naive loop to replace double-spaces with single-spaces as long as they exist. Probably could be fixed with additional regex magic but I don't know if I want to go that route.
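
One possible way to handle the extra whitespace without a loop is a single `preg_replace` on `\s+` before matching. The sketch below also builds the pattern at run time from a list of player names, as described in the first post. The function name `parse_use_command`, the returned array keys, and the case-insensitive `i` flag are all assumptions made for illustration, not part of the original code.

```php
<?php
// Hypothetical helper: parses a command against the players currently in the room.
function parse_use_command(string $input, array $playerNames): ?array
{
    // Collapse every run of whitespace to one space and trim the ends,
    // instead of looping until no double spaces remain.
    $command = trim(preg_replace('/\s+/', ' ', $input) ?? $input);

    $pattern = '/^(use|eat|drink|break|rub) (.+?)';

    if ($playerNames !== []) {
        // preg_quote escapes regex metacharacters in display names;
        // its second argument also escapes the "/" delimiter.
        $quoted = array_map(fn ($name) => preg_quote($name, '/'), $playerNames);
        $pattern .= '(?: on (' . implode('|', $quoted) . '))?';
    }

    // Assumption: matching should ignore case, so "Bob" and "bob" both work.
    $pattern .= '$/i';

    if (preg_match($pattern, $command, $m) !== 1) {
        return null;
    }

    return [
        'action' => strtolower($m[1]),
        'item'   => $m[2],
        'target' => $m[3] ?? null, // unset when there is no " on ..." suffix
    ];
}

var_dump(parse_use_command('  use   healing  potion on Bob ', ['bob', 'alice']));
```

Quoting the names matters because they come from players: a display name containing `.` or `|` would otherwise change what the pattern matches.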


  • I think Bob might mind if you used

    drink potion on Bob

    At least I would kindly ask you to get down from me. And by kindly I'd guess I mean

    Use AXE on @mott555

    Filed Under: Might have to separate those actions more.



  • There will be additional validation code after the basic regex matching, don't worry.


  • @mott555 said:

    There will be additional validation code after the basic regex matching, don't worry.

    Additional validation done in jQuery so it will work, right?


  • @Kuro said:

    I think Bob might mind if you used

    drink potion on Bob

    At least I would kindly ask you to get down from me.


    That's a higher-level validation, and so shouldn't be done in a regex. The basic question that REs answer is “does this string match this pattern (of a particular complexity class that seems to be tractable)” and not whether there was any sense involved in the affair.
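
To make the split between matching and validation concrete, here is a rough sketch of what a check after the regex step might look like. Everything in it is hypothetical: the lookup tables `ITEM_KINDS` and `ALLOWED_ACTIONS`, the function `is_sensible_command`, and the rule that only the generic "use" may be aimed at another player are assumptions, since the thread does not show the real item data or rules.

```php
<?php
// Hypothetical lookup tables; the real item data is not shown in the thread.
const ITEM_KINDS = [
    'bread'            => 'food',
    'healing potion'   => 'potion',
    'teleport trinket' => 'trinket',
    'stone tablet'     => 'tablet',
];

const ALLOWED_ACTIONS = [
    'food'    => ['eat', 'use'],
    'potion'  => ['drink', 'use'],
    'trinket' => ['rub', 'use'],
    'tablet'  => ['break', 'use'],
];

// Runs after the regex has split the command into action, item and target.
function is_sensible_command(string $action, string $item, ?string $target): bool
{
    $kind = ITEM_KINDS[$item] ?? null;
    if ($kind === null) {
        return false; // unknown item
    }
    if (!in_array($action, ALLOWED_ACTIONS[$kind], true)) {
        return false; // e.g. "drink bread"
    }
    // Assumed rule: only the generic "use" may be aimed at another player,
    // so "drink healing potion on bob" is rejected.
    return $target === null || $action === 'use';
}

var_dump(is_sensible_command('drink', 'healing potion', 'bob'));
```

The regex only answers whether the text has the right shape; whether the action makes sense for the item stays in ordinary code where it is easier to change.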

