If I have the following text:

Lorem ipsum dolor sit amet, https://example.com/1aDb_sc-Xy consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat https://example.com/h3ab6--sc3 nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

How can I extract the URLs from it with PHP? I know that the URLs start with https://example.com/ and that exactly 10 characters follow.

My goal is to end up with a PHP array that looks like this:

$array = 
[
  'https://example.com/1aDb_sc-Xy',
  'https://example.com/h3ab6--sc3'
];
Comment (The fourth bird): Try preg_match_all('~\bhttps://example\.com/.{10}~', $str, $matches);

1 Answer

You can use preg_match_all with a regex that matches https://example.com/ and then everything up to the next whitespace character:


<?php

$string = "Lorem ipsum dolor sit amet, https://example.com/1aDb_sc-Xy consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat https://example.com/h3ab6--sc3 nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";

preg_match_all('!https:\/\/example\.com\/\S+!', $string, $matches);

var_dump($matches[0]);

// array(2) { 
//    [0]=> string(30) "https://example.com/1aDb_sc-Xy" 
//    [1]=> string(30) "https://example.com/h3ab6--sc3" 
// }

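Edit: To match https://example.com/ followed by exactly 10 characters (the full URLs are still in $matches[0]; the capture group puts just the 10-character part in $matches[1]):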

preg_match_all('!https:\/\/example\.com\/(.{10})!', $string, $matches);
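Both patterns above can pick up unwanted characters: \S+ keeps trailing punctuation such as a comma or full stop, and .{10} accepts any character, including spaces. Below is a minimal sketch of a stricter variant. It assumes the 10 characters are always letters, digits, underscores, or hyphens (which is true of the two sample URLs but is not stated in the question), and the helper name extractExampleUrls is hypothetical.

```php
<?php

// Hypothetical helper, not part of the original answer.
// Assumption: the 10-character part only contains A-Z, a-z, 0-9, "_" and "-".
function extractExampleUrls(string $text): array
{
    // [A-Za-z0-9_-]{10} matches exactly 10 allowed characters;
    // (?![A-Za-z0-9_-]) rejects the candidate if an 11th allowed character follows.
    $pattern = '~\bhttps://example\.com/[A-Za-z0-9_-]{10}(?![A-Za-z0-9_-])~';

    preg_match_all($pattern, $text, $matches);

    // $matches[0] holds the full matches, i.e. the complete URLs.
    return $matches[0];
}

// Sample text with punctuation directly after each URL.
$text = 'See https://example.com/1aDb_sc-Xy, then https://example.com/h3ab6--sc3.';

print_r(extractExampleUrls($text));
```

With this pattern the comma and the full stop after the URLs are left out of the result. If the identifiers can contain other characters, the character class would need to be widened accordingly.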