2
votes

What regex should I use to match the following url as the full_path?

https://edition.cnn.com/search\?q\=test\&size\=10\&category\=us,politics,world,opinion,health

The (?:www.|http\:\/\/|https\:\/\/).*} does not work it matches only www. and up until search

    sm := mux.NewRouter()
    sm.HandleFunc("/", getIndex).Methods("GET")
    sm.HandleFunc(`/api/v1/hostname/{full_path:(?:www.|http\:\/\/|https\:\/\/).*}`, URLHandler)

Update the Handler is:

func URLHandler(w http.ResponseWriter, r *http.Request) {
    vars := mux.Vars(r)
    url := vars["full_path"]

    fmt.Fprintf(w, "Full path is: ", url)
}

EDIT2:

This worked for me

sm := mux.NewRouter().SkipClean(true).UseEncodedPath()
sm.HandleFunc("/", getIndex).Methods("GET")
sm.HandleFunc(`/api/v1/hostname/{full_path:.+`, URLHandler)

And on the handler I used url.PathUnescape(r.URL.RequestURI()) and xurls to extract the url

func URLHandler(w http.ResponseWriter, r *http.Request) {
    URL, _ := url.PathUnescape(r.URL.RequestURI())
    ext := xurls.Strict().FindAllString(URL, -1)
    fmt.Fprintf(w, "Full path is: ", ext)
}
1
The URL is a path component. That means it'll be full of escape characters, and any regex you write has to match with escape characters. As an alternative, you can simply use {full_path} in the path and parse the URL yourself to make sure it is valid.Burak Serdar
i tried the {full_path} but it doesnt output anythingOlive.b
How? Show the handler code. Also, show how you're calling itBurak Serdar
I added the handlerOlive.b
How are you registering the path? and how are you calling it?Burak Serdar

1 Answers

2
votes

I think gorilla mux will only use the path part and not the query string of the URL. Try to append RawQuery.

Something like this:

func URLHandler(w http.ResponseWriter, r *http.Request) {
    vars := mux.Vars(r)
    url := vars["full_path"] + "?" 
    if r.URL.RawQuery != "" {
        url += "?" + r.URL.RawQuery
    }
    fmt.Fprintf(w, "Full url is: ", url)
}