Yahoo Email Phishing through Data URI

Hackers have come up with new and innovative ways to perform credential phishing attacks. Typical credential phishing is done by creating look-alike web pages hosted on compromised web servers or on servers owned by the attacker. The problem for an attacker is that signature-based technologies eventually catch up and blacklist these pages.

Attackers have found a new way to make phishing attacks more difficult to detect. Over the last couple of months we have seen a rise in phishing attacks that use a content spoofing technique called “DATA URI” to hide malicious content.

Data URI Explained

Using the DATA URI technique, an attacker embeds a base64-encoded media BLOB directly within a web page. Modern browsers that support DATA URI decode the embedded data and render it within the page. When used for legitimate pages, this technique allows normally separate elements such as images and style sheets to be fetched in a single HTTP request, which is typically more efficient than issuing multiple HTTP requests.

The structure of a DATA URI can be defined as:

data:[<mediatype>][;charset=<charset>][;base64],<data>

  • “mediatype” is an optional parameter defining the type of media represented by the BLOB. If mediatype is not specified, the DATA URI is assumed to be text/plain.
  • “charset” is an optional parameter defining the character set to use for the encoded BLOB. If charset is not specified, the character set is assumed to be US-ASCII.
  • “data” is the content itself, typically Base64 encoded, in which case the “;base64” token precedes the comma.
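To make the parts listed above concrete, the following is a minimal Python sketch that splits a DATA URI into its media type, character set, Base64 flag and decoded payload. The function name parse_data_uri is our own illustrative choice, and the sketch handles only the simple cases discussed here rather than every corner of the specification.

```python
import base64
from urllib.parse import unquote_to_bytes


def parse_data_uri(uri):
    """Split a data URI into (mediatype, charset, is_base64, payload_bytes).

    Illustrative helper only; not a complete parser.
    """
    if uri[:5].lower() != "data:":
        raise ValueError("not a data URI")
    # Everything before the first comma is the header; the rest is the data.
    header, sep, data = uri[5:].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separator")

    parts = header.split(";")
    mediatype = parts[0] or "text/plain"  # default when mediatype is omitted
    charset = "US-ASCII"                  # default when charset is omitted
    is_base64 = False
    for param in parts[1:]:
        if param.lower() == "base64":
            is_base64 = True
        elif param.lower().startswith("charset="):
            charset = param[len("charset="):]

    if is_base64:
        payload = base64.b64decode(data)
    else:
        payload = unquote_to_bytes(data)  # non-Base64 data is percent-encoded
    return mediatype, charset, is_base64, payload


print(parse_data_uri("data:text/html;charset=utf-8,%3Cb%3Ehi%3C%2Fb%3E"))
```

Note that both defaults are applied independently, so a URI such as "data:,hi" comes back as text/plain with the US-ASCII character set.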

A simple example of a Base64 DATA URI is given below:

data:text/plain;base64,aGVsbG8=

When the above text is entered into a browser’s address bar, the browser decodes the base64 data and displays the content (the plain text string ‘hello’ in this case) in its main window using the specified media type.
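The same encode and decode round trip can be reproduced outside a browser. The sketch below is illustrative only: the helper names make_data_uri and read_data_uri are ours, and the text/plain media type is an assumption chosen to match the plain text string ‘hello’. Decoding relies on Python’s standard urllib, which understands the data: scheme.

```python
import base64
from urllib.request import urlopen


def make_data_uri(payload, mediatype="text/plain"):
    """Wrap raw bytes in a Base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mediatype};base64,{encoded}"


def read_data_uri(uri):
    """Decode a data URI with the standard library; no network access occurs."""
    if not uri.lower().startswith("data:"):
        raise ValueError("expected a data URI")
    with urlopen(uri) as response:
        return response.headers.get_content_type(), response.read()


uri = make_data_uri(b"hello")
print(uri)                 # data:text/plain;base64,aGVsbG8=
print(read_data_uri(uri))  # ('text/plain', b'hello')
```

Nothing in the encoded form resembles the original text, which is what makes the technique attractive for hiding page content.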

Real-World Example

Recently the SlashNext Active Cyber Defense System detected a phishing attack at a customer site designed to compromise a user’s Yahoo! email account using this DATA URI technique. The user received a phishing email containing a notification that the user’s Yahoo! email account was about to expire due to inactivity and that the user needed to sign in immediately in order to keep the account active.

The phishing URL looked like this:

hxxp://XXfindustries  [.]  com/wp-includes/Text/Diff/Renderer/update/yahoo/start2.htm

When this URL is loaded in a browser, a fake Yahoo! sign-in page is presented to the user.


The fake Yahoo! page is rendered through a series of redirections and data-decoding steps.

Step 1:

The initial redirector page contains a meta tag with a refresh time set to 0 seconds. Instead of serving the fake page directly, this page redirects the browser to a DATA URI that carries a base64-encoded BLOB.
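From a defender’s point of view, this kind of redirector is easy to spot once you know to look for it. The following is a minimal sketch, assuming the page uses a standard meta refresh tag whose target starts with “data:”. The class and function names are hypothetical, and the Base64 payload in the usage example is a harmless placeholder, not the content of the real attack page.

```python
import re
from html.parser import HTMLParser

# Matches content values such as "0; url=data:text/html;base64,...."
REFRESH_RE = re.compile(
    r"\s*(\d+)\s*[;,]\s*(?:url\s*=\s*)?['\"]?(.*?)['\"]?\s*$",
    re.IGNORECASE | re.DOTALL,
)


class MetaRefreshFinder(HTMLParser):
    """Collect (delay_seconds, target_url) for every meta refresh tag."""

    def __init__(self):
        super().__init__()
        self.refreshes = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        match = REFRESH_RE.match(attr.get("content", ""))
        if match:
            self.refreshes.append((int(match.group(1)), match.group(2)))


def find_data_uri_refreshes(html):
    """Return only the refreshes whose target is a data URI."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return [(delay, url) for delay, url in finder.refreshes
            if url.lower().startswith("data:")]


page = ('<html><head><meta http-equiv="refresh" '
        'content="0; url=data:text/html;base64,PGI+aGk8L2I+"></head></html>')
print(find_data_uri_refreshes(page))
```

The delay is returned alongside the target because a zero-second refresh into a data URI is rarely something a legitimate page needs.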


Step 2:

The purpose of the DATA URI is to display an animated popup with a Yahoo! logo that asks the user to wait while the account is prepared for update. This popup lasts about 5 seconds. The user is then redirected to a second-stage phishing page.




Step 3:

The second stage phishing URL contains yet another DATA URI which embeds the base64 encoded HTML of the final sign-in page.
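An analyst who wants to see what the browser finally renders has to peel these layers off one at a time. The sketch below shows the idea under simplifying assumptions: it follows only the first text/html Base64 DATA URI found in each layer, and it treats the layers as nested directly inside one another, whereas in the attack described here a network redirect sits between the stages. The function name unwrap_layers and the depth limit are our own.

```python
import base64
import re

# Only Base64 text/html data URIs are followed in this sketch.
DATA_URI_RE = re.compile(
    r"data:text/html;base64,([A-Za-z0-9+/]+={0,2})", re.IGNORECASE
)


def unwrap_layers(html, max_depth=5):
    """Return the list of HTML layers, outermost first."""
    layers = [html]
    for _ in range(max_depth):
        match = DATA_URI_RE.search(layers[-1])
        if not match:
            break
        try:
            raw = base64.b64decode(match.group(1))
        except ValueError:  # malformed Base64 (binascii.Error is a ValueError)
            break
        layers.append(raw.decode("utf-8", errors="replace"))
    return layers


def wrap(html, delay):
    """Build a meta refresh that points at a data URI holding `html`."""
    blob = base64.b64encode(html.encode("utf-8")).decode("ascii")
    return (f'<meta http-equiv="refresh" '
            f'content="{delay}; url=data:text/html;base64,{blob}">')


inner = "<form>placeholder sign-in form</form>"
outer = wrap(wrap(inner, 5), 0)
for depth, layer in enumerate(unwrap_layers(outer)):
    print(depth, layer[:60])
```

The depth limit keeps a maliciously deep or self-referencing chain from stalling the analysis.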



Step 4:

Once the victim enters their credentials and clicks the “Next Step” button, the credentials are sent in plain text through a web form to a third C&C domain.
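Both properties of this last step, a form that posts to a different host and does so without encryption, can be checked mechanically once the final HTML has been decoded. The following is a rough sketch; the names FormActionCollector and form_targets and the example hosts are hypothetical and are not taken from the attack.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class FormActionCollector(HTMLParser):
    """Record the action attribute of every form on the page."""

    def __init__(self):
        super().__init__()
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.actions.append(dict(attrs).get("action") or "")


def form_targets(html, page_url):
    """Return (absolute_action, is_offsite, is_cleartext) for each form."""
    collector = FormActionCollector()
    collector.feed(html)
    page_host = urlsplit(page_url).hostname
    results = []
    for action in collector.actions:
        target = urljoin(page_url, action)  # resolves relative actions
        parts = urlsplit(target)
        results.append(
            (target, parts.hostname != page_host, parts.scheme == "http")
        )
    return results


html = ('<form method="post" action="http://collector.example/post.php">'
        '<input name="passwd" type="password"></form>')
print(form_targets(html, "http://landing.example/yahoo/start2.htm"))
```

A sign-in form that is both off-site and unencrypted is a strong hint that the page is not what it claims to be, although neither signal is conclusive on its own.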


The combination of chained redirections and DATA URI encoding makes these attacks much more difficult to detect (and therefore more dangerous) than conventional phishing attacks that serve their content in plain text. Signature-based technologies that scan for common keywords cannot detect these phishing pages, because the keywords appear only after the base64 data has been decoded. Sandboxes that are not specifically designed to catch this kind of attack likewise fail to trap the pages.
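A toy example makes the keyword problem visible. The sketch below is not a description of how any product works: the keyword list, the regular expression and the function names are illustrative assumptions. It shows a naive keyword scan coming up empty on a page whose content is hidden in a Base64 DATA URI, and the same scan succeeding once the embedded data is decoded first.

```python
import base64
import re

KEYWORDS = ("yahoo", "password", "sign in")  # illustrative signature list
BASE64_DATA_URI = re.compile(
    r"data:[^,\"'\s]*;base64,([A-Za-z0-9+/]+={0,2})", re.IGNORECASE
)


def keyword_hits(text):
    """Naive signature check: which keywords appear in the raw text?"""
    lowered = text.lower()
    return [word for word in KEYWORDS if word in lowered]


def keyword_hits_after_decoding(text):
    """Decode every Base64 data URI first, then run the same check."""
    parts = [text]
    for match in BASE64_DATA_URI.finditer(text):
        try:
            raw = base64.b64decode(match.group(1))
        except ValueError:  # skip malformed Base64
            continue
        parts.append(raw.decode("utf-8", errors="replace"))
    return keyword_hits("\n".join(parts))


hidden = "<h1>Yahoo</h1><p>Sign in with your password</p>"
blob = base64.b64encode(hidden.encode("utf-8")).decode("ascii")
page = f'<iframe src="data:text/html;base64,{blob}"></iframe>'

print(keyword_hits(page))                 # raw scan of the encoded page
print(keyword_hits_after_decoding(page))  # scan after decoding
```

Decoding before scanning handles only this single layer of hiding; pages that add redirects or script-generated content need the page to be rendered and analysed dynamically.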

Instead of relying on signature- or sandbox-based technologies, companies with advanced IT security operations are increasingly turning to technologies that implement dynamic page analysis using cutting-edge machine learning algorithms and artificial intelligence. These new detection techniques successfully trap DATA URI based phishing pages as well as other attacks where legacy products fail.
