
Computer News - IT news, business technology, reviews

Intel graphics cards release date, news and rumors

For the longest time, the best graphics cards have been an all-out war between Nvidia vs AMD. However, there have been rumors floating around for a while now that Intel is building a team of GPU talent to jump in on the action. In a marketplace where Nvidia is charging exorbitant prices for the RTX 2080 Ti and RTX 2080 and AMD is focusing on the budget market with the RX 590, we’re ready for a third party to jump into this war and spice up the market again.

If Intel approaches the GPU space targeting the same audience it does with its processors, trying to dominate the high end, it could finally give Nvidia some real competition and force it to drive prices down. Still, we don’t know what these new graphics cards are going to look like, or when we’re going to see them. We don’t even know if they’ll ever actually exist.

But, that doesn’t mean we can’t do a bit of speculation. There has been enough movement out there on the Intel GPU front to pull all the information together and try to create a kind of wish list for what we’d like to see out of Intel’s GPUs. So, keep this page bookmarked, as we’ll keep it updated with any and all information that comes our way.

Cut to the chase
  • What is it? Intel’s rumored line of graphics cards
  • When is it out? Sometime in 2020
  • What will it cost? No one knows yet
Intel graphics cards release date

Intel graphics cards seem set for release sometime in 2020, and that’s the only thing we can be sure about. The chipmaker itself has stated at least twice that it’s on track to release graphics cards in 2020, but we don’t have a more precise date than that, unfortunately. We’re hoping they’ll arrive sooner rather than later, at the start of 2020.

The way we see it playing out is that Intel will announce its graphics architecture at CES 2019, with enterprise cards hitting the market first. Then, we could see the company holding a separate launch event for consumer graphics cards when they’re ready to hit the market, similar to what we saw with Coffee Lake Refresh. Of course, it could work the other way around, but if Intel’s own teaser video is to be believed, it’s not going to play out like that.

Either way, we won’t know the exact release date for Intel’s graphics cards until the company is ready to tell us (or until the date gets leaked, as is likely to happen).


Intel graphics cards price

Intel’s pricing for its graphics cards is ultimately going to boil down to what segment of the consumer market it plans to capitalize on. We’re sure that there will be professional and datacenter GPUs that cost thousands of dollars, but we’re more interested in consumer or gaming graphics cards.

If Intel decides to compete with AMD, we could see it start out with mid-range cards priced around $300 (about £230, AU$420) that give the Radeon RX 5xx series a run for its money. This could be compelling, because Nvidia doesn’t have any current-generation cards in this range, and who knows what the RTX 2060 is going to cost.

What we think is more likely, though, is that Intel will target the high-end and enthusiast market first, similar to what it’s doing with Coffee Lake Refresh. We could see Intel go all-in, trying to compete with the Nvidia GeForce RTX 2080 Ti or the RTX 2080, undercutting them by a couple hundred bucks and succeeding, especially if it’s able to pack in enough traditional GPU power. We doubt that Intel will be able to compete with Nvidia Turing’s more unique ray tracing and Tensor Core-powered AI features.

At the end of the day, we don’t know what Intel is doing here, but we’re excited nonetheless. We’ll be keeping our ears to the ground on this one, waiting until more information starts surfacing — so stay tuned.

Intel graphics cards specs

Usually, this is the part of the story where we dive into past releases and try to suss out what the future products are going to look like. But we can’t really do that this time around: it’s been almost two decades since Intel last released a discrete GPU, and that didn’t end so well for team blue.

So, instead, we’re going to dive into some speculation based on Intel’s teaser video and a bit of what’s going on in the scene these GPUs will be entering.

In Intel’s teaser, wherein it claims to ‘set your graphics free’ – whatever that means – Intel reminds us that it not only created the first GPU capable of handling 4K Netflix, but also the first fully DirectX 12 compliant GPU and a gaming PC that’s ‘as thin as a phone’. With these claims, and considering the talent it’s poached, like ex-AMD graphics guru Raja Koduri, it looks like Intel is going to try to push the envelope a bit.

Whether that means it’ll support fancy rendering techniques like Nvidia is with the RTX cards, we don’t know, but Nvidia could certainly use some competition at the high-end, and we might see Intel make a run for it. 

At the end of the day, there are only a couple of things we need to see in these new GPUs: they need to be capable of 4K gaming, and they need to be priced competitively. If Intel is able to hit these two marks, we could see it competing in the bloodthirsty GPU marketplace. But we won’t know until Intel is ready to share – we can’t wait for CES 2019.

The best gaming mouse for Fortnite

In the mad dash to win a Fortnite Victory Royale, you're going to do a lot of looking around. Whether you're keeping your character's head on a swivel to make sure no one is trying to snipe you from a distance, quickly selecting loot, or perfecting flick shots on your enemies, you need to be able to rely on your mouse. That's why you'll want to have the best mouse for Fortnite. 

At the heart of any great gaming mouse is a good sensor that keeps up with your quick hand movements for perfectly tracked aim. Ever seen a pro land a great shot just an instant after spotting an enemy? That's muscle memory, knowing just how far they need to move their hand to get the crosshairs on the target, and they're relying on their mouse's sensor to be as consistent as they are.

Of course, after the sensor, you'll also need a mouse that performs reliably in other respects, sending your inputs as quickly as possible. A few extra buttons can also help you create shortcuts for your favorite gear slots or building needs. With all that in mind, we've picked out the best mice you can get for Fortnite, whether you need wired, wireless, or just a good deal.

It should come as no surprise that the best mouse in gaming also helps you play Fortnite at the highest level. The key aspect of the SteelSeries Rival 600 is the TrueMove 3+ sensor. Little matters more to your aim than your own skill and the mouse sensor you've got. You want your hand's motion to be translated perfectly into the game, and the TrueMove 3+ sensor can do that while also ensuring that it doesn't track motion when you pick the mouse up.

As if that best-in-class sensor wasn't enough, the Rival 600 has a great design, with dazzling lights in eight separate zones. Silicone grips make it easy to hold, and the split-trigger buttons offer a reliable clicking experience. The three side buttons are just a nice little extra, giving you an easy option for mapping the keybinds for your construction or key weapon slots.

Read the full review: SteelSeries Rival 600 

If you have a busy desk space, more wires may be the last thing you want. And, with the possibility of mouse wire snagging and messing up your aim, it's reasonable to consider a wireless gaming mouse. It's especially reasonable when they can come as good as the Corsair Dark Core RGB SE. It offers a sensor with sensitivities up to 16,000 DPI, along with a 1,000Hz polling rate and 1ms response times for performance on par with a wired mouse. And, with Qi wireless charging, you never have to plug this mouse in.

The design combines plastics and textured soft-touch materials for a comfortable and sure grip. You'll also get plenty in the way of buttons for custom mapping. Beyond the primary mouse buttons and scroll wheel, there's a profile switch button, two DPI buttons, front and back buttons, and a sniper button that can lower the sensitivity instantly for fine-tuned aiming. Although Corsair's customization software isn't the easiest to use, it will let you map the Dark Core RGB SE's buttons to perform different functions, and then you can map them to key controls in Fortnite.

Read the full review: Corsair Dark Core RGB SE 

If you don't want to spend a lot of money but still want a mouse that will perfectly track your mouse movements, this is it. The SteelSeries Sensei 310 features the same TrueMove 3 optical sensor that makes the Rival 600 such an excellent performer. It just lacks the second sensor that gives the more expensive Rival its lift-off detection. If you can live without the second sensor, then you can get a great value on the Sensei 310 with 1-to-1 tracking on settings up to 3,500 DPI.

The look of the Sensei 310 is fairly simple, with an ambidextrous design that sees two thumb buttons included on both sides of the mouse. The primary mouse buttons are split from the body for a reliable and consistent click. And, the scroll wheel has a textured silicone for a sure grip. Similarly, the sides feature soft, textured silicone, so you'll be able to pick the mouse up in the most frantic moments at the end of a Fortnite match without fumbling and missing that game-winning shot. If you don't need the ambidextrous design, the Rival 310 performs the same but with a right-handed design.

Read the full review: SteelSeries Sensei 310 

The 15 best White Elephant tech gifts

Let’s review the situation: you’ve got a holiday party coming up at your workplace or among friends, and everyone’s tasked with bringing one baffling, jokey item. You, brave holiday shopper, must find a perfect White Elephant gift.

What is White Elephant? Just keep in mind that these are not ideas for Secret Santa, which is a much more thoughtful tradition about getting just the right gift for someone. No, White Elephant (or Yankee Swap or Dirty Santa, as the kids apparently call it) is about wrapping bizarre gifts and gleefully waiting to see the dumbstruck look on whoever opens your oddball item.

Here are 15 things that fit the bill – and because we’re TechRadar, they’ve all got a techie spin. We’ll stick to things between $20 and $30, which is the usual limit for White Elephant swaps.

But for all you overachievers, we’ve included a few ridiculous items at the end that exceed that price threshold if you spare no expense on gag gifts.

The top White Elephant gifts

Pricey, but still perfect

The best VR movies: experience the next era of storytelling

Virtual reality (VR) is on the rise, and while apps and games get a ton of press, there are plenty of VR movies to enjoy, too.

VR movies are amazing for a number of reasons.

For starters, they’re much more immersive than their non-VR counterparts – thanks, of course, to the fact that you can watch many of them in 360 degrees.

That can make them much more interesting – plus, you’re likely to see something new every time you watch, thanks to the fact that you can look all around you while watching.

But, not all VR movies are really worth watching.

After all, many of them are underdeveloped, and even those with backing may suffer from poor storytelling or other issues. 

There’s also the fact that there really aren’t any full-length feature films in VR – so you’ll be limited to watching short films. 

Regardless, here are the best VR movies today.

Invasion

Looking for a fun experience that kids can enjoy just as much as adults? Invasion is an Emmy award-winning short film filled with color. It’s about two aliens with dreams of taking over the world, but, when they get here, they’re greeted by two adorable little bunnies. Narrated by Ethan Hawke and featuring beautiful visuals and well-designed animation, Invasion is a short film any VR-lover can enjoy. 

Ashes to Ashes

Ashes to Ashes may be a little more adult than Invasion, but it’s still definitely worth checking out. The film tells the story of a dysfunctional family as they struggle through the loss of a grandfather, whose last wish was to have his ashes blown up. In the film, you’ll take the perspective of the urn, which allows you to passively observe the story as it takes place around you. The film also exposes the behind-the-scenes process, by showing the crew filming the movie, and even showing the actors out of character at times.

It: Float

This one is for the horror fans out there. The modern retelling of Stephen King’s It was one of the biggest horror movies ever made and, in celebration of the film, a VR experience was also released. The VR film brings the viewer to the clown Pennywise’s abode, with newly horrifying details around every turn. Safe to say, this one isn’t for the faint of heart or young kids – but if you were a big fan of It, then it’s definitely one to check out.

The Conjuring 2 – Enfield 360 Experience

Here’s another one for the horror fans. This VR experience was made to promote the release of The Conjuring 2, and in it you’ll enter the Hodgsons’ house and experience the terror of the Enfield Haunting for yourself. At the start of the film, you’ll be briefly greeted by director James Wan, after which you’ll head into the home. Shortly after, the lights begin to flicker on and off, objects on the wall spin around and more. You’ll quickly find that you’re an unwelcome guest in the home. As with the previous movie, it’s safe to say this isn’t a film for those who get scared easily.

The Invisible Man

The Invisible Man follows more of a story than many of the other VR films on this list. In the film, you’ll observe low-level drug traffickers Nick and Kid, who secretly have a stash of high-value drugs hidden in a barn. Unfortunately for them, they also owe a debt to Frank, who suddenly shows up to their hideout – and insists that they settle the score with a game of Russian Roulette. It’s a slightly scary film, but all of the questions you have should be answered by the end – so it’s worth watching the whole thing.

Supported content on TechRadar means the article has been created in partnership with a developer, publisher, manufacturer or other relevant party. When you see this disclosure note in an article, it means that the article idea has been approved by another company – a developer, hardware maker, or publisher – but that otherwise the content is planned, written, and published by TechRadar without any further approval. This is distinct from sponsored content on TechRadar, which is created entirely by a third party, and not the TechRadar editorial team.

Yes, the Xbox Adaptive Controller is innovative - if you can afford the added expenses

Back in September, Microsoft released the Xbox Adaptive Controller, an innovative new controller for Xbox One and Windows PC designed specifically to make gaming more accessible for those with disabilities. But is it truly usable by individuals, or simple for their carers to set up, without assistance from a charity?

The Xbox Adaptive Controller was designed for those with disabilities, so it wouldn’t have been appropriate for me to test its capabilities myself. After putting out a call on Twitter asking if any gamers with disabilities would like to try out the controller, I was put in contact with Mark Fox: an avid gamer and software developer.

Mark has Arthrogryposis Multiplex Congenita (AMC) – a musculoskeletal disease which results in decreased flexibility of joints due to multiple joint contractures throughout the body. In Mark’s case, AMC affects all his tendons and joints – making them all either slightly or very short. 

This means that he experiences cramps and pain from playing games too long (due to the shape of the controller) and tires from exertion. On occasions when he plays for too long, Mark’s joints will lock up causing his hands to get stuck in the shape of the controller. 

Despite this, Mark has always been a big gamer - though he finds it “impossible” to play PlayStation titles, as the PlayStation 4's DualShock 4 controllers are too small, causing him intense pain.

Mark was eager to try out the Xbox Adaptive Controller in hopes it would make gaming more accessible for him, possibly leading him to purchase the device for himself in the future. We invited him over to TechRadar HQ, and we put the controller through its paces.

What does the Xbox Adaptive Controller look like?

Microsoft kindly supplied us with an Xbox Adaptive Controller to test, along with some input devices: a foot pedal, one-handed joystick and a pressure-sensitive contact point. When you buy the controller, it doesn’t come with these input devices – you only get the main controller itself which costs $99.99 (£74.99/AU$129.99 ). 

You have to pay extra for the input devices, but we’ll come back to that. Straight out of the box, you get the controller and a USB cable, so it’s simple to connect to the Xbox One. You can either keep it wired via USB or use the controller wirelessly, much like a regular Xbox One controller (just hold the Xbox button on the controller to connect it).

So what does the Xbox Adaptive Controller actually look like? It comes in white and is roughly the size of a small keyboard, measuring 292mm (L) x 130mm (W) x 23mm (H). There are two large black buttons (about the size of coasters) in the middle - one being the A button and the other being B. There are then some smaller buttons to the left of these: the Xbox button, view button, menu button and a shift button, alongside a D-pad.

There are 19 3.5mm ports and two USB 2.0 ports for external inputs, alongside one 3.5mm stereo headset jack for audio.

Remap management

Mark tells me he prefers playing AAA titles, so we decide to put the controller through its paces with Assassin’s Creed Odyssey to start. But our immediate issue is trying to work out how to map the inputs to particular buttons. After some confusion between us, and some fumbling with the Xbox menu, we find the mapping settings and Mark chooses which buttons he wants to map to the input devices: the one-handed joystick for movement, the pressure control for Y and the foot pedal for right trigger.

"This is something that should be in the top tier menus,” Mark points out. “So that when you're playing a game, if you just hit the game button, you should be able to get to the mapping menu."

Luckily, Mark is a seasoned gamer and knows about mapping, but that isn’t necessarily the case for everyone who may use the controller, or their carers. We had to consult the Xbox Adaptive Controller FAQs to work out how exactly to do it.

Mark soon gets to grips with movement and using the various buttons, praising the one-handed joystick for providing him with an easier way of movement. However, it isn’t long before he notices an issue. “I don’t have X,” he states. Back into the menu and Mark changes his right foot pedal to X, meaning he now lacks trigger buttons - accessing one button means forfeiting another. It becomes apparent that, for a game such as this, an input device would be needed for every button (and we had already been provided with three).

Excluding the D-pad, Xbox button, menu button and view button, the original Xbox controller has 10 buttons (including the two analogue sticks). The Xbox Adaptive Controller - straight out of the box - comes with only A and B. Thus an input device is required for each of the other buttons - and you need to buy each one separately. So if you ideally wanted use of all the same buttons at the same time, without swapping out buttons mid-play, you would need to buy eight inputs.

"You would need at least two toggle controllers or the equivalent and a button for Y and X,” Mark explains. "It should come with enough inputs to at least provide four buttons, whether that is external or built-in. They don't have to be the most expensive, high-quality, just as long as the basic functionality is there. Then people could pay extra for higher quality versions of things they specifically need."

Saying that, there is the option to create individual mapping profiles and to shift between button settings - simply press the shift button on the main controller to switch profiles. However, doing so (and working out how to do so efficiently) proves a struggle.

After remapping the controls to allow himself to use X (the combat button in Assassin’s Creed Odyssey), Mark encounters a new issue. The one-handed joystick is used to move, but the buttons on it (LB and left trigger) aren’t being picked up. We go back into the mapping menu. The input device only picks up one button per input, so he holsters his plans to try out the bow - his usual choice of weapon.

"It seems to be aware of the stick, but it's not aware that it has any buttons,” he vents. "It should come with the bare-bones basics you need to actually use it. Out of the box, it's not a usable product - except for really basic Xbox arcade stuff."

Racing ready?

Acknowledging that Assassin’s Creed Odyssey was perhaps too complex a game to begin our test with, we move onto another - Forza Horizon 2. Mark remaps the controller again for the game, making use of the right pedal as an accelerator. This time, Mark finds the controls easier to handle - using just the pedal and toggle to navigate the coastal roads. "It feels as easy and natural as any other racing game I've ever played,” he beams.

Mark’s only issue is the camera panning; he points out that a game would need to have automatic camera panning, otherwise another input would be required - the same issue arose with Assassin’s Creed Odyssey.

As he races along, I ask Mark his overall feeling on the Xbox Adaptive Controller and whether he, personally, would purchase it.

"People with disabilities often tend to have lower income,” Mark explains. “I know it's not always the case - and I wouldn't want to generalize. I'm lucky that I can work from home so I can afford a hobby, bear in mind a regular Xbox controller is anywhere from £50 up. I can afford it but it would still give me pause. It risks being prohibitively expensive." 

"The other risk is that the inaccessibility of it puts more strain on the charities because more people turn to the charities because there's no way they can reasonably afford it themselves,” Mark continues. 

So if it came with some external inputs, would that change his mind? “There should be enough inputs [on the Xbox Adaptive Controller] that you can play the average Xbox game,” Mark tells me. “There's going to be some games which have a million controls and a lot going on where it's not practical to cater straight out of the box. But it should cater to your basic big headline games like your Fallouts and your Halos.”

Gamers' charity

SpecialEffect is a British-based charity which aims to “put fun and inclusion back into the lives of people with physical disabilities by helping them to play videogames.” How do they do this? By assessing the needs of those with disabilities and using technology ranging from eye-control to joypads to help them access games.

After testing the controller with Mark, I spoke to SpecialEffect’s communications support Mark Saville and project manager Bill Donegan about the Xbox Adaptive Controller - which SpecialEffect helped design and test.

“It’s so dependent on people’s abilities whether they’ll find it useful or not,” Saville explains when I tell him of my test with Mark. “One man’s meat is another man’s poison.” 

During development, SpecialEffect tested the Xbox Adaptive Controller with a range of disabilities including spine injuries, cerebral palsy and muscular dystrophy. “We tried to get a range of the types of people that we might work with,” Donegan explains. “But within that everyone is so different, it's hard to get as wide a range as possible.”

According to SpecialEffect, Microsoft did not create the Xbox Adaptive Controller with just one disability or specific adaptation in mind; instead aiming to cater to as many different needs as possible.

“Even from the first prototype, we saw how much they were trying to fit in one product,” Donegan tells me. “There didn't seem to be much compromise when it came to leaving things out. From that point of view, they're completely on the right track from our perspective.

“The feedback for us has been really great. It's part of our kit, so we use it alongside lots of other equipment. There's lots of equipment which is compatible with it and other equipment is completely separate to this. But we're finding we're using it very frequently. I think part of that would be the fact it was made first-party for their console and for Windows, so it's going to be supported. It's almost plug-and-play in terms of actually connecting things up, which is a big change for us because a lot of the hard stuff is getting the equipment to work for a specific person but there's another challenge in getting that equipment just to work with the console for instance - because it's not official.”

"We have had other equipment which connects to Xbox before,” Saville interjects. “But none have done it in quite such a clear way and that's what this controller offers.”

But the problem is, it wasn’t that clear - not for me and Mark anyway - and I wanted to know if SpecialEffect had received similar feedback.

“Interestingly when it first came out there was a wave of positive news saying 'this makes gaming so much more accessible to so many more people' and yes, it is a tool for doing that but there's still the physical gap between the controller and the body,” Saville laughs. “Since it's come out, we have had those questions come in about how to use it.”

That’s why SpecialEffect has created a range of YouTube videos (released after our test) to show people how to create profiles, use the controller efficiently and map inputs. However, the controller is potentially still overly complicated for individuals not aided by the charity. In addition, it is hard to overlook the mounting price of buying various input devices.

“It's certainly not a magic bullet,” Saville agrees. 

“I think it's a slow burner and as the community comes up with solutions themselves, and start sharing those, that'll feed peoples' ideas of what they might be able to try for themselves,” Donegan explains. “Hopefully more peripherals will be made for it and there will be more options. The controller has a lot of flexibility which means in turn that there's lots of options and knowing which option is for you is the difficult part.

“What we're sharing is how we're using it from our perspective and hopefully we'll pick up on how other people are using it and it'll give us some ideas.”

CatBoost Enables Fast Gradient Boosting on Decision Trees Using GPUs

Machine Learning techniques are widely used today for many different tasks. Different types of data require different methods. Yandex relies on Gradient Boosting to power many of our market-leading products and services including search, music streaming, ride-hailing, self-driving cars, weather prediction, machine translation, and our intelligent assistant among others.

Gradient boosting on decision trees is a form of machine learning that works by progressively training more complex models to maximize the accuracy of predictions. Gradient boosting is particularly useful for predictive models that analyze ordered (continuous) data and categorical data. Credit score prediction which contains numerical features (age and salary) and categorical features (occupation) is one such example.

Gradient boosting benefits from training on huge datasets. In addition, the technique is efficiently accelerated using GPUs. A large part of the production models at Yandex are trained on GPUs so we wanted to share more insights on our expertise in this area.

Let’s look more closely at our GPU implementation for a gradient boosting library, using CatBoost as the example. CatBoost is our own open-source gradient boosting library that we introduced last year under the Apache 2 license. CatBoost yields state-of-the-art results on a wide range of datasets, including but not limited to datasets with categorical features.

Gradient Boosting on Decision Trees

We’ll begin with a short introduction to Gradient Boosting on Decision Trees (GBDT). Gradient boosting is one of the most efficient ways to build ensemble models. The combination of gradient boosting with decision trees provides state-of-the-art results in many applications with structured data.

Let’s first discuss the boosting approach to learning.

Developers use these techniques to build ensemble models in an iterative way. On the first iteration, the algorithm learns the first tree to reduce the training error, as shown in the left-hand image in figure 1. This model usually has a significant error; it’s not a good idea to build very big trees in boosting, since they overfit the data.

Figure 1. First and second trees usually include significant error

The right-hand image in figure 1 shows the second iteration, in which the algorithm learns one more tree to reduce the error made by the first tree. The algorithm repeats this procedure until it builds a decent quality model, as we see in figure 2:

Figure 2. N-th tree

Gradient Boosting is a way to implement this idea for any continuous objective function. The common approach for classification uses Logloss, while regression optimizes using root mean square error. Ranking tasks commonly implement some variation of LambdaRank.

Each step of Gradient Boosting combines two steps:

  1. Computing gradients of the loss function we want to optimize for each input object
  2. Learning the decision tree which predicts gradients of the loss function

The first step is usually a simple operation which can be easily implemented on CPU or GPU. However, the search for the best decision tree is a computationally taxing operation since it takes almost all the time of the GBDT algorithm.
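To make the first of those steps concrete, here is a minimal C++ sketch (our own illustration, not CatBoost's actual code; the function and variable names are invented for the example) that computes per-object Logloss gradients with respect to the current raw model scores:

#include <cmath>
#include <cstddef>
#include <vector>

// Logloss for a label y in {0, 1} and predicted probability p is
// -[y * log(p) + (1 - y) * log(1 - p)]; its gradient with respect to the raw
// model score (the logit) works out to simply p - y.
std::vector<double> LoglossGradients(const std::vector<double>& rawScores,
                                     const std::vector<int>& labels) {
    std::vector<double> gradients(rawScores.size());
    for (std::size_t i = 0; i < rawScores.size(); ++i) {
        const double p = 1.0 / (1.0 + std::exp(-rawScores[i]));  // sigmoid of the score
        gradients[i] = p - labels[i];
    }
    return gradients;
}

The second step, fitting a tree to these gradients, is the part that dominates training time and is therefore the part worth accelerating on the GPU.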

Decision tree learning

Dealing with ordered inputs in GBDT requires fitting decision trees in an iterative manner. Let’s discuss how it is done. For simplicity, we’ll talk about classification trees, which can be easily described.

Classification and regression decision trees are learned in a greedy way. Finding the next tree requires you to calculate all possible feature splits (feature value less than some predefined value) of all the features in the data, then select the one that improves the loss function by the largest value. After the first split is selected, the next split in the tree will be selected in a greedy fashion: the first split is fixed, and the next one is selected given the first one. This operation is repeated until the whole tree is built.
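Here is a hedged C++ sketch of that brute-force search for a single feature (our own simplification: each "value < threshold" condition is scored with a squared-error proxy, which is not necessarily CatBoost's exact scoring function):

#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

struct Split {
    double threshold = 0.0;
    double score = -std::numeric_limits<double>::infinity();
};

// Evaluate every "feature value < threshold" condition on one feature and keep
// the best one. sum^2 / count per partition is a standard proxy for the
// squared-error reduction obtained by fitting a constant in each partition.
Split BestSplitOneFeature(const std::vector<double>& values,
                          const std::vector<double>& targets) {
    std::vector<std::size_t> order(values.size());
    for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
    std::sort(order.begin(), order.end(),
              [&](std::size_t a, std::size_t b) { return values[a] < values[b]; });

    double totalSum = 0.0;
    for (double t : targets) totalSum += t;

    Split best;
    double leftSum = 0.0;
    for (std::size_t i = 0; i + 1 < order.size(); ++i) {
        leftSum += targets[order[i]];
        if (values[order[i]] == values[order[i + 1]]) continue;  // no threshold separates equal values
        const double rightSum = totalSum - leftSum;
        const double nLeft = static_cast<double>(i + 1);
        const double nRight = static_cast<double>(order.size() - i - 1);
        const double score = leftSum * leftSum / nLeft + rightSum * rightSum / nRight;
        if (score > best.score) {
            best.score = score;
            best.threshold = 0.5 * (values[order[i]] + values[order[i + 1]]);
        }
    }
    return best;
}

Running this for every feature, fixing the winner and repeating on each resulting partition is exactly the greedy procedure described above.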

For example, assume we have two input features, user age and music track length, graphed in figure 3, and want to predict whether the user will like the song or not.

Figure 3. Like / dislike distribution and possible splits

The algorithm evaluates all possible splits of track length and chooses the best one. This is a brute-force approach: we just iterate through all conditions to find the best result. For the problem in the picture, we should split the points by music track length at approximately 8 minutes.

Now we have two partitions and the next step is to select one of them to split once again, this time by age. The algorithm repeats, selecting a partition and splitting it by some feature until we reach the stopping condition, shown in figure 4.

Figure 4. Splitting process

Different stopping conditions and methods of selecting the next partition lead to different learning schemes. The most well-known of these are the leaf-wise and depth-wise approaches.

In the leaf-wise approach, the algorithm splits the partition to achieve the best improvement of the loss function and the procedure continues until we obtain a fixed number of leaves. The algorithm in the depth-wise approach builds the tree level by level until a tree of a fixed depth is built.

CatBoost uses symmetric or oblivious trees. The trees from the music example above are symmetric. In fact, they can be represented as decision tables, as figure 5 shows.

Figure 5. Decision tree for music example

CatBoost uses the same features to split learning instances into the left and the right partitions for each level of the tree. In this case a tree of depth k has exactly 2^k leaves, and the index of a leaf can be calculated with simple bitwise operations.

Thus, the CatBoost learning scheme is essentially depth-wise with some simplification, obtained from our decision tree type.
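As a small illustration of that bitwise indexing, here is a C++ sketch (the struct and function below are ours, purely for illustration): each level of an oblivious tree applies the same condition to every object and contributes one bit of the leaf index.

#include <cstddef>
#include <cstdint>
#include <vector>

// One split condition per tree level: "feature[featureIndex] > border".
struct LevelSplit {
    int featureIndex;
    float border;
};

// A tree of depth k has exactly 2^k leaves; the leaf index is just the
// concatenation of the k split results, assembled with shifts and ORs.
uint32_t LeafIndex(const std::vector<float>& features,
                   const std::vector<LevelSplit>& levels) {
    uint32_t index = 0;
    for (std::size_t depth = 0; depth < levels.size(); ++depth) {
        const uint32_t bit = features[levels[depth].featureIndex] > levels[depth].border;
        index |= bit << depth;  // level `depth` contributes bit number `depth`
    }
    return index;
}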

The choice of oblivious trees has several advantages compared to the classic ones:

  • Simple fitting scheme
  • Efficient to implement on CPU
  • Ability to make very fast model appliers
  • This tree structure works as a regularization, so it can provide quality benefits for many tasks

The classical decision tree learning algorithm is computation-intensive. To find the next split, we need to evaluate a number of splitting conditions on the order of feature count times observation count. This leads to a vast number of possible splits for large datasets with continuous inputs and, in many cases, also leads to overfitting.

Fortunately, boosting allows us to significantly reduce the number of splits that we need to consider. We can make a rough approximation for input features. For example, if we have music track length in seconds, then we can round it off to minutes. Such conversions could be done for any ordered features in an automatic way. The simplest way is to use quantiles of input feature distribution to quantize it. This approach is similar in spirit to using 4 or 8 bit floats for neural networks and other compression techniques applied in deep learning.

Thus, after this quantization, each input can be treated as an integer with a small number of distinct values. By default, CatBoost approximates ordered inputs with 7-bit integers (128 different values).
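A rough C++ sketch of quantile-based binning (the general idea only; CatBoost's real border selection is more sophisticated and configurable): choose borders at evenly spaced quantiles of the observed values, then map each raw value to a small integer bin.

#include <algorithm>
#include <cstdint>
#include <vector>

// Pick (binCount - 1) borders at evenly spaced quantiles of the observed values.
std::vector<float> QuantileBorders(std::vector<float> values, int binCount) {
    std::sort(values.begin(), values.end());
    std::vector<float> borders;
    for (int b = 1; b < binCount; ++b) {
        borders.push_back(values[values.size() * b / binCount]);
    }
    return borders;
}

// Map a raw feature value to its bin index by counting how many borders it passes.
// With binCount = 128 the result fits into 7 bits, matching the default above.
uint8_t Quantize(float value, const std::vector<float>& borders) {
    uint8_t bin = 0;
    for (float border : borders) {
        if (value >= border) ++bin;
    }
    return bin;
}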

In the next section we’ll discuss how to compute decision trees with 5-bit integers as inputs. These inputs are easier to explain, and we will use them to show the basic idea of the computation scheme.

Histograms

If our inputs contain only 5-bit integers, we need to evaluate feature count times 32 different splitting conditions. This quantity does not depend on the number of rows in the input dataset. It also allows us to make an efficient distributed learning algorithm suitable for running on multiple GPUs.

The search for the best split now is just a computation of histograms, shown in figure 6.

Figure 6. Split now is just a computation of histograms

For each integer input feature, we need to compute the number of likes and dislikes for each feature value in classification, or two similar statistics in regression. For decision tree learning we compute two statistics simultaneously: the sum of gradients and the sum of weights.
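To show why the split search reduces to histograms, here is a simplified CPU-side C++ sketch (ours, not the GPU kernel; the scoring formula is a common squared-error proxy, assumed here only for illustration): accumulate the two statistics per bin, then score every "bin <= b" condition from prefix sums of the histogram.

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct BinStats {
    double gradientSum = 0.0;
    double weightSum = 0.0;
};

// One histogram per feature: hist[bin] accumulates the sum of gradients and the
// sum of weights of all objects that fall into that bin.
std::vector<BinStats> BuildHistogram(const std::vector<uint8_t>& bins,
                                     const std::vector<double>& gradients,
                                     const std::vector<double>& weights,
                                     int binCount) {
    std::vector<BinStats> hist(binCount);
    for (std::size_t i = 0; i < bins.size(); ++i) {
        hist[bins[i]].gradientSum += gradients[i];
        hist[bins[i]].weightSum += weights[i];
    }
    return hist;
}

// Every "bin <= b" split can be scored from the histogram alone: the left
// partition is a prefix of the histogram, the right partition is the remainder.
double BestSplitScore(const std::vector<BinStats>& hist) {
    double totalG = 0.0, totalW = 0.0;
    for (const BinStats& s : hist) { totalG += s.gradientSum; totalW += s.weightSum; }
    double bestScore = 0.0, leftG = 0.0, leftW = 0.0;
    for (std::size_t b = 0; b + 1 < hist.size(); ++b) {
        leftG += hist[b].gradientSum;
        leftW += hist[b].weightSum;
        const double rightG = totalG - leftG;
        const double rightW = totalW - leftW;
        if (leftW > 0.0 && rightW > 0.0) {
            bestScore = std::max(bestScore, leftG * leftG / leftW + rightG * rightG / rightW);
        }
    }
    return bestScore;
}

On the GPU, the expensive part is the accumulation itself, which is done in shared memory as described next.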

We use fast shared memory to perform the aggregation. Since the size of shared memory is restricted, we group several features together to achieve the maximum performance gain. If every computation block works with 4 features and 2 statistics simultaneously, we can build an efficient histogram layout and computation pipeline.

The basic idea is simple: allocate an independent histogram per warp. The layout of the histogram should be chosen so that no operation in shared memory causes a bank conflict, so the threads in a warp should work with addresses that are distinct modulo 32. The algorithm splits the warp's shared memory histogram into four groups: the first eight threads work with the first histogram group, the second eight threads work with the second group, and so on.

Memory access is done with the following indexing:

32 * bin + 2 * featureId + statisticId + 8 * groupId, where groupId = (threadIdx.x & 31) / 8

This layout is shown in figure 7 below.

Figure 7. The layout of the histogram

The layout has a nice property: bank conflicts for memory access do not depend on the input data, because the data-dependent part of the shared memory address, 32 * bin, is always 0 modulo 32.

Thus, if 2 * featureId + statisticId + 8 * groupId provides different addresses modulo 32 for the threads of a warp, then the shared memory operations proceed without bank conflicts.

This can be achieved if updates in shared memory are done in several passes. On the first pass, the first thread works with the first feature and the first statistic, the second thread works with the first feature and the second statistic, the third thread works with the second feature and the first statistic, and so on.

On the second pass each thread works with the other statistic: the first thread works with the first feature and the second statistic, the second thread works with the first feature and the first statistic.

On the third pass the first thread works with the second feature, the third thread works with the third feature and so on. Below you can see the pseudo-code for updating statistics for one point:

void AddPoint(const ui32 featuresGroup, const float t, const float w) {
    // target or weight to use first
    const bool flag = threadIdx.x & 1;
    // Warp Layout:
    // feature: 0 0 1 1 2 2 3 3 1 1...
    // stat:    t w t w t w t w t w...
    histGroupId = (threadIdx.x & 31) / 8;
    hist = WarpHist + 8 * histGroupId;
    for (int i = 0; i < 4; i++) {
        int f = (threadIdx.x / 2 + i) % 4;
        const int shift = 28 - 4 * f;
        int bin = featuresGroup >> shift;
        bin &= 31;
        int offset0 = 32 * bin + 2 * f + flag;
        int offset1 = 32 * bin + 2 * f + !flag;
        hist[offset0] += (flag ? t : w);
        hist[offset1] += (flag ? w : t);
    }
}

As you can see, the operations hist[offset0] += (flag ? t : w); and hist[offset1] += (flag ? w : t); use distinct modulo 32 addresses, so there are no bank conflicts during aggregation.
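For the skeptical reader, here is a small host-side C++ check of that claim (a verification sketch only, not part of the kernel): on the first pass, the 32 threads of a warp produce 32 distinct addresses modulo 32, so each thread hits a different shared memory bank regardless of the bin value.

#include <cstdio>
#include <set>

int main() {
    std::set<int> banks;
    const int bin = 0;  // the data-dependent part, 32 * bin, is 0 modulo 32 anyway
    for (int threadIdx = 0; threadIdx < 32; ++threadIdx) {
        const int statisticId = threadIdx & 1;       // the kernel's `flag`
        const int groupId = (threadIdx & 31) / 8;
        const int featureId = (threadIdx / 2) % 4;   // feature handled on the first pass (i == 0)
        const int offset = 32 * bin + 2 * featureId + statisticId + 8 * groupId;
        banks.insert(offset % 32);
    }
    std::printf("distinct banks touched: %zu of 32\n", banks.size());  // prints 32
    return 0;
}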

This histogram computation scheme makes an important trade-off—no atomic operations in exchange for less occupancy.

Thus, for 32-bin histograms, 2 statistics and 4 features per thread block, we can run at most 384 threads, which need 48KB of shared memory to store the histograms. As a result, our kernel achieves at most 38% occupancy on Maxwell and later hardware; on Kepler we get just 19% occupancy.

A simple benchmark proves these numbers. First, double the number of working threads in the kernel. Now the first 384 threads and the second 384 threads are writing to the same addresses with atomic operations, so we have at most two-way atomic conflicts. As you can see from the charts in figure 8 below, the atomic version is slower despite its bigger occupancy.

Figure 8. Atomic vs non-atomics 32-bin histograms for different GPU generations

This benchmark measured histogram computation time for the first tree level, the most computationally intensive operation. For deeper levels, avoiding atomics yields smaller gains on Maxwell and newer devices, because we need to make random accesses to memory and the atomic operations are hidden by memory latency. However, deeper levels also require less computation.

We’ve also implemented histogram specializations for 15-bin, 64-bin, 128-bin and 255-bin histograms. Different problems need different numbers of bits to approximate input floats in an optimal way: in some cases 5 bits are enough, while others need 8 bits to achieve the best result. CatBoost approximates floats with 7-bit integers by default. Our experiments show this default offers good enough quality, though it can be changed for some datasets. If you care about learning speed, using 5-bit inputs can give huge speed benefits while still generating good quality results.

Categorical features

CatBoost is a special version of GBDT: it perfectly solves problems with ordered features while also supporting categorical features. A categorical feature is a variable that can take one of a limited, and usually fixed, number of possible values (categories). For instance, an ‘animal’ feature could be set to ‘cat’, ‘dog’, ‘rat’, etc.

Dealing with categorical features efficiently is one of the biggest challenges in machine learning.

The most widely used technique for dealing with categorical predictors is one-hot encoding. The original feature is removed and a new binary variable is added for each category. For example, if the categorical feature was ‘animal’, new binary variables would be added: whether this object is a ‘cat’, whether it is a ‘dog’, and so on.

This technique has disadvantages:

  1. For features with high cardinality, it requires building deep decision trees to recover the dependencies in the data. This can be mitigated with the hashing trick: categorical features are hashed into several different bins (often 32-255 bins are used). However, this approach still significantly affects the resulting quality.
  2. It doesn’t work for unknown category values, i.e., values that don’t exist in the training dataset.

Another way of dealing with categorical features, which often provides superior quality compared to hashing and/or one-hot encoding, is the so-called label-encoding technique, which converts discrete categories to numerical features. Label-encoding can also be used in online-learning systems. We will explain this technique for the classification problem below, but it can also be generalized to other tasks.

Since a decision tree works well with numeric inputs, we should convert categorical factors to numerical ones. Let’s replace a categorical feature with several statistics computed from the labels. Assume we have a dataset of user/song pairs and labels (did the user like the song or not?). For each pair there is a set of features, one of which is the music genre.

We can use the categorical feature music genre to estimate the probability of the user liking a song, and use that estimate as a new numerical feature, probOfLikeForGenre.

Such a probability can be estimated from the training data as the ratio of likes to all occurrences of the genre:

probOfLikeForGenre(g) = sum_i [genre_i = g][label_i = like] / sum_i [genre_i = g]

Here [proposition] is an Iverson bracket: the value is equal to 1 if the proposition is satisfied, and it is equal to 0 otherwise.

This is a decent approach but could lead to overfitting. For example, if we have a genre that is seen only once in the whole training dataset, then the value of the new numerical feature will be equal to the label value.

In CatBoost we use a combination of the following two techniques to avoid overfitting:

  1. Use Bayesian estimators with a predefined prior
  2. Estimate probabilities using a scheme inspired by online learning

As a result, our estimators combine a predefined prior with counts estimated in an online fashion. The math behind these formulas is fully explained in two of our papers: https://arxiv.org/abs/1706.09516 and http://learningsys.org/nips17/assets/papers/paper_11.pdf.
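As a rough, hypothetical illustration of those two ideas combined (a simplified sketch only; the exact estimators are derived in the papers linked above, and CatBoost additionally uses random permutations of the data), the C++ function below encodes a categorical column with a smoothed success rate computed only from the rows that precede each object, so an object's own label never leaks into its feature value.

#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

struct CategoryCounts {
    double positives = 0.0;  // "like" labels seen so far for this category
    double total = 0.0;      // objects seen so far for this category
};

// Hypothetical helper, not CatBoost's API: online, prior-smoothed target statistic.
std::vector<double> OnlineTargetStatistic(const std::vector<std::string>& categories,
                                          const std::vector<int>& labels,
                                          double prior, double priorWeight) {
    std::unordered_map<std::string, CategoryCounts> counts;
    std::vector<double> encoded(categories.size());
    for (std::size_t i = 0; i < categories.size(); ++i) {
        CategoryCounts& c = counts[categories[i]];
        // Smoothed estimate from preceding objects only; the prior keeps rare
        // categories from collapsing onto their own label.
        encoded[i] = (c.positives + prior * priorWeight) / (c.total + priorWeight);
        c.positives += labels[i];  // update counts only after encoding object i
        c.total += 1.0;
    }
    return encoded;
}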

GPU calculation of the desired statistics is simple. You need to group samples by the values of genre using RadixSort. Next, run the SegmentedScan primitive to compute the numerator and denominator statistics. Finally, use a Scatter operation to obtain the new feature.

One problem with the proposed solution is that the new features do not account well for feature interactions. For example, users' tastes can be very different, and their likes for different genres can differ too: one person could like rock and hate jazz, a second could love jazz and hate rock. Sometimes combinations of features can provide new insights about the data, but it is not possible to build all feature combinations, because there are too many.

To solve this problem while still using combinations, we have implemented a greedy algorithm that searches for useful feature combinations. We need to compute success rate estimators on the fly for the generated combinations. This can’t be done in the preprocessing stage and needs to be accelerated on the GPU.

We have described above how to compute the numeric statistics; the next step is to efficiently group samples by several features instead of a single one. To accomplish this, we build perfect hashes for feature combinations from the source features. This is done efficiently with standard GPU techniques: RadixSort + Scan + Scatter/Gather.
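The effect of that perfect hash can be sketched on the CPU as follows (our own simplified C++ version for a two-feature combination; the real implementation generalizes to more features and runs on the GPU with the primitives listed above): every observed combination of category values gets its own dense integer id, which is then treated as a single new categorical feature.

#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Map every observed (featureA, featureB) pair to a dense id in [0, numDistinct).
std::vector<uint32_t> CombineTwoCategoricals(const std::vector<uint32_t>& a,
                                             const std::vector<uint32_t>& b) {
    std::map<std::pair<uint32_t, uint32_t>, uint32_t> ids;
    std::vector<uint32_t> combined(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const auto key = std::make_pair(a[i], b[i]);
        const auto result = ids.emplace(key, static_cast<uint32_t>(ids.size()));
        combined[i] = result.first->second;  // existing id, or the newly assigned one
    }
    return combined;
}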

Performance

The next part of this post contains different comparisons that show performance of the library.

Below we describe in detail when it is efficient to use GPUs instead of CPUs for training. It should be noted that comparing different ML frameworks on GPUs is a challenge that is beyond the scope of this post; we refer the interested reader to a recent paper, «Why every GBDT speed benchmark is wrong», which goes into the related issues in detail.

CPU vs GPU

The first benchmark shows the speed comparison of CPU versus GPU. For this benchmark we have used a dual-socket Intel Xeon E5-2660v4 machine with 56 logical cores and 512GB of RAM as a baseline and several modern GPUs (Kepler K40, Maxwell M40, Pascal GTX 1080Ti and Volta V100) as competitors.

The CPU used in this comparison is very powerful; more mainstream CPUs would show larger differences relative to GPUs.

We couldn’t find a big enough openly available dataset to measure performance gains on GPUs, because we wanted a dataset with a huge number of objects to show how the speedup changes as the dataset grows. So we used an internal Yandex dataset with approximately 800 ordered features and varying sample counts. You can see the results of this benchmark in figure 9.

Figure 9. Relative speed-ups of GPUs compared to dual-socket Intel Xeon E5-2660v4 server.

Even a powerful CPU can’t beat a Kepler K40 on large datasets. Volta demonstrates an even more impressive gain, almost 40 times faster than the CPU, without any Volta-specific optimizations.

For multi-classification mode, we see even greater performance gains. Some of our internal datasets, with 200 features, 4,659,476 objects and 80 classes, train on a Titan V 90 to 100 times faster than on the CPU.

The main takeaway from this chart: the more data you have, the bigger the speedup. It makes sense to use a GPU starting from some tens of thousands of objects, and the best results are observed on big datasets containing several million examples.

Distributed learning

CatBoost can be efficiently trained on several GPUs in one machine. However, some of our datasets at Yandex are too big to fit into 8 GPUs, which is the maximum number of GPUs per server.

NVIDIA announced servers with 16 GPUs, each with 32GB of memory, at the 2018 GTC, but dataset sizes become bigger each year, so even 16 GPUs may not be enough. This prompted us to implement the first open source boosting library with computation on distributed GPUs, enabling CatBoost to use multiple hosts for accelerated learning.

Computations scale well over multiple nodes. The chart below shows the training time for the first 200 iterations of some Yandex production formulas in different setups: one machine with onboard GPUs connected via PCIe; two machines with 2, 4 or 8 GPUs per machine connected via 1GbE; and two systems with 2, 4 or 8 GPUs connected with Mellanox InfiniBand 56Gb.

CatBoost achieves good scalability, as shown in the chart in figure 10. On 16 GPUs with InfiniBand, CatBoost runs approximately 3.75 times faster than on 4 GPUs. The scalability should be even better with larger datasets. If there is enough data, we can even train models on slow 1GbE networks: two machines with two cards per machine are not significantly slower than 4 GPUs on one PCIe root complex.

Figure 10. CatBoost scalability for multi-GPU learning with different GPU interconnections (PCIe, 1GbE, Mellanox InfiniBand)

Conclusion

In this post we have described the CatBoost algorithm and the basic ideas we’ve used to build its GPU version. Our GPU library is highly efficient, and we hope that it will provide great benefits for our users. It is very easy to start GPU training with CatBoost: you can train models using Python, R or the command-line binary, as covered in our documentation. Give it a try!

The post CatBoost Enables Fast Gradient Boosting on Decision Trees Using GPUs appeared first on NVIDIA Developer Blog.

Opera builds cryptocurrency wallet into browser

During the recent Hard Fork Decentralized blockchain event in London, Opera announced that its mobile browser for Android will now include cryptocurrency wallet integration and Web 3 support.

With the latest version of the company's mobile browser, users will be able to send and receive Ethereum.

By adding cryptocurrency support, Opera believes its browser has the potential to renew and extend its role as a tool to access information, make transactions online and manage users' online identity in a way which gives them more control.

Opera for Android will use the infrastructure platform Infura to access the Ethereum blockchain, since it provides secure, reliable and scalable access to Ethereum.

Web 3 support

Web 3 is an umbrella term used to describe a set of emerging technologies intersecting cryptocurrencies, blockchains and distributed systems that, when used together, extend the capabilities of the web.

Opera believes that the web we know today will be the interface to the decentralised web of tomorrow in the form of Web 3, which is why the company chose to include it in the latest release of its browser.

Web 3 itself faces a number of challenges before reaching wider adoption such as users' understanding of new terminology, difficulties acquiring cryptocurrency and complicated installation procedures.

Users are more likely to adopt new solutions when they are user-friendly and seamless which is why Opera decided to integrate Web 3 alongside the current web in the latest update for its Android browser.

EVP of Browsers at Opera Krystian Kolondra explained why the company chose to include Web 3 support in its latest release, saying:

“We are empowering Android smartphone users with an innovative browser that gives them the opportunity to experience Web 3 in a seamless way. I would like to invite all tech enthusiasts who may have heard of blockchain but haven't yet experienced it to simply give our new browser and Web 3 a try. We have made it extremely easy. Our hope is that this step will accelerate the transition to cryptocurrencies from speculation and investment to being used for actual payments and transactions in our users' daily lives.”

Gigabyte GeForce RTX 2060 leak reveals photos and robust specs

It’s starting to look like Nvidia will have more than mobile versions of its GeForce RTX graphics cards at CES 2019, as there’s a fresh leak about the Nvidia GeForce RTX 2060.

Videocardz’s spies at Gigabyte released a photo of their version of the upcoming GeForce RTX 2060 alongside some specs. The card purportedly features the TU106 GPU we expected, alongside 1,920 CUDA cores, 6GB of GDDR6 memory and a maximum frequency of 1,200MHz.

Comparatively, the Nvidia GTX 1060 featured 1,280 CUDA cores, 6GB of GDDR5 memory (GDDR5X after a recent refresh) and a boost clock of 1,708MHz.

The RTX 2060's lower rumored maximum frequency has us a little worried. However, every GPU in the Nvidia Turing series so far has featured lower overall clock speeds while still delivering far greater performance than its predecessor.

There are still a lot of specifics we don’t know about the GeForce RTX 2060, including how many RT and Tensor cores it will feature, when it will be released and for how much. There’s also the question of how legitimate Videocardz’s sources are. These images are just renderings, which could just as easily portray any of Gigabyte’s current graphics cards, and the specs could be entirely made up as well.

The good news is we may soon know what’s what, as Nvidia will hold its usual CES press conference early next month. While the majority of the keynote will likely be about deep learning and autonomous cars, we fully expect to get some updates on the company’s new graphics cards as well.

Via Fudzilla

Lead image credit: Videocardz

Sharp boosts printer security with new launches

In an effort to help accelerate the growth of the technology-driven workplace, Sharp Imaging and Information Company of America (SIICA) has introduced a new line of its Advanced and Essential Series multifunction printers.

Of the eleven new models, seven are available now, with the remaining four arriving in early summer 2019. All of the new models come with an easy-to-use touchscreen display, and Sharp has added new features for conversational AI, cloud integration and security enhancements.

The new MFP Voice feature, powered by Amazon Alexa, allows users to interact with their printer using simple voice commands. Support for cloud services has also been expanded with the addition of Box and Dropbox.

Users will also now be able to directly print PDF files from a variety of sources thanks to Adobe's Embedded Print Engine. Administrators will even be able to add new applications and update existing ones through Sharp's new Application Portal which will be available in spring 2019.

Enhanced security

Both the new colour Advanced and Essentials Series deliver leading-edge security features, including firmware attack prevention and a self-recovery capability which can detect malicious intrusions and restore the machine's firmware to its original state.

A whitelisting feature has also been added to protect the machines' file systems from unauthorised access.

Vice President of Product Management at SIICA Shane Coffey explained the company's motivation for launching its latest line of multifunction printers, saying:

“This is an exciting time at Sharp, as we roll out a new line of color workgroup MFPs that offer greater integration, enhanced workflow capability, and voice integration to meet the demands of today’s technology-driven workplace. Our focus continues to be to provide quality MFPs that deliver superior productivity, performance and ease-of-use. We’re proud to continue that tradition with this new lineup of color Advanced and Essentials series models.” 

The MX-3071, MX-3571 and MX-4071 from the Advanced Series and the MX-2651, MX-3051, MX-3551, and the MX-4051 from the Essentials Series are available now with four additional models joining the lineup in early summer 2019.

You can play classic Sega Genesis games with Amazon Fire TV

Just in time for the holidays, Amazon has announced that it's bringing childhood nostalgia to Fire TV with the brand new Sega Classics games bundle. 

To celebrate the 30th anniversary of the Sega Genesis, Fire TV customers will get access to 25 Sega Genesis (or Mega Drive if you're in the UK) games that have stood the test of time, including Sonic the Hedgehog, Golden Axe, and Streets of Rage.

If you already have Amazon's Fire TV, you'll be able to play using your Fire TV remote, meaning there's no need to pay for a pricey console or special controller – although you can pair a compatible Bluetooth controller with your Fire TV if you prefer. 

Here's the full list of classic games included in the bundle:

  • Sonic the Hedgehog
  • Sonic the Hedgehog 2
  • Sonic CD
  • The Revenge of Shinobi
  • Ristar
  • Golden Axe
  • Beyond Oasis
  • Decap Attack
  • ESWAT: City Under Siege
  • Streets of Rage
  • Streets of Rage II
  • Streets of Rage III
  • Gunstar Heroes
  • Dynamite Headdy
  • Dr Robotnik's Mean Bean Machine
  • Columns
  • Bio-Hazard Battle
  • Comix Zone
  • Alien Storm
  • Bonanza Bros
  • Golden Axe II
  • Golden Axe III
  • Gain Ground
  • Altered Beast
  • Sonic Spinball
Fun for the family

Sega Classics is available now on Fire TV for £11.99, and the bundle includes 15 multiplayer games, making it a great purchase ahead of Christmas when you'll need to find ways to entertain friends and family. 

The launch of the new app follows the news that Sega's highly anticipated Mega Drive Mini console will be delayed until 2019 – but with Fire TV you needn't wait that long to conquer Green Hill Zone all over again.

Need a Google Home Mini before Christmas? Then don't miss these deals

Just because Black Friday is behind us now doesn't mean you've missed your chance to get a cheap Google Home Mini. Actually, you have even more options today on this random day in December than you did during the recent sale event.

You can get a Google Home Mini on its own for just $29 and in any color too, including the brand new Aqua version (pictured below). Or if you're after a few more smart home items, you can bundle the smart speaker with some of those too. Take a look below for details.

Oh, you thought we were done with the discounts? Not quite. If you've had your eye on Google's other products, we're delighted to report deals on them at Walmart too. The Google Home Hub (the one with the screen) is down from $149 to $129, while the original Google Home, which has a louder speaker than the Mini, is down from $129 to $99. These are likely to be the lowest prices you'll see on these two before Christmas, but we're not quite as excited about these deals because both were a further $20 cheaper around Black Friday.

The aforementioned Google Home Mini deals are the best we've ever seen, though – and we've been tracking prices since the speaker launched over on our Google Home deals page. If you want to check out the competition, take a peek at our latest Amazon Echo prices and deals guide.

Still Christmas shopping and need some ideas? We have dozens of guides to the cheapest prices on laptops, TVs, gaming consoles, gadgets, cell phones and more over on our best deals section.

The best deal on our favourite mobile of 2018 - the Samsung Galaxy S9 Plus

The question of 'what is the best phone?' sparks fierce debate – just read the comments on any phone-related news story! But we've been sticking with the same answer since the device saw its release - the Samsung Galaxy S9 Plus. 

The S9 Plus really is a powerhouse in every way. It's big, muscular, stunning to look at, and has an incredible camera, especially in low light. With big phones and phablets more popular than ever, this really is the best one out there.

Not only is this the best device, but right now you can also get it for a seriously affordable price with this Fonehouse deal. 30GB of data for just £86.99 upfront and £36 per month. That is enough data to binge around 10 hours of HD Netflix in one go (not that we would recommend that) or send around 45 million WhatsApp messages (we definitely wouldn't recommend that). 

All sound good to you? Well, scroll down to see all of the details in full, or if for whatever reason this deal hasn't won you over, check out our guide to all of the best S9 Plus deals available right now. 

This Samsung S9 Plus deal in full

Surface Pro deal and Surface Laptop deal are both up to $360 off

Just in time for the last day to order something online and expect to receive it before Christmas, Microsoft has slashed prices on its 2017 Surface Pro and Surface Laptop models through Best Buy.

Better yet, the Surface Pro doorbuster deal through Best Buy includes a Type Cover in the package – something we've hoped Microsoft would do on its own for years.

The Surface Laptop on offer is more of a standard affair with a simple, if sizable, price cut. Regardless, both are excellent deals coming in at the 11th hour of Christmas shopping.

When you look at these deals, you might be thinking, "Why are you showing me deals on old products?"

That's because these 'old' products barely qualify as such, and still have plenty of oomph left in them to keep up with computing trends for the next several years. Take it from us: sometimes it pays to wait for moments like this to grab a coveted laptop or tablet that's a fraction less powerful for a much lower price.

For one of the cheapest entry points into Microsoft's luxury laptop world, we'd suggest hopping on this deal quickly – we doubt it will last up to the expiration date.

The best mobile hotspots for 3G and 4G in 2018

If you do a lot of travelling and don't want to put your data or information at risk by relying on other people's Wi-Fi connections, then you'll want a mobile Wi-Fi hotspot device, commonly known as a Mi-Fi.

These Mi-Fi hubs let several devices connect through one or more data SIMs – usually 4G ones – so you can surf privately and safely. Much like a smartphone's SIM, these data-only plans give you your own secure connection to the internet. You can use a monthly contract or a pay-as-you-go SIM, so you know exactly how much data you're paying for. The best thing about 4G data is that it's incredibly fast, letting you browse as if you were on a fixed broadband connection – and it will often be quicker than free or shared Wi-Fi hotspots, which usually have data limits and plenty of traffic to deal with.

Mi-Fi hubs range from a simple one SIM solution with a battery to models that can accommodate 10 different SIM cards, or others that even sport a complete Android operating system.

Below are the best mobile Wi-Fi routers you can buy in the UK, catering to all tastes, from frugal surfers to power users and everything in between.

The TP-Link M7350 is an excellent mobile hotspot, supporting both micro and nano SIM cards, which means it's almost certainly going to be compatible with a SIM card you already own. It has a small display for keeping you informed about your connection, and it supports dual-band Wi-Fi on both 2.4GHz and 5GHz. It can be accessed by up to 10 devices at once, and performance is very good on 4G LTE. Its battery life is also excellent, giving you around 10 hours of 4G connectivity.

The EE 4GEE WiFi Mini is one of the better looking mobile hotspot devices on this list, and its compact design means it can be easily carried around with you. The 1,500mAh battery offers up to 50 hours on standby, and up to six hours when connected to the internet. It can support up to 10 devices at once, but unlike the TP-Link M7350 it doesn't have an LCD screen, which means it's not quite as user friendly. You also need to use the EE network, which isn't too much of a hardship given EE's coverage and fast 4G speeds, and the network offers a range of data plans to go alongside the EE 4GEE WiFi Mini.

Netgear's AC810 Aircard is an excellent mobile hotspot that lets you quickly and easily share a fast 4G LTE internet connection with a wide range of devices. Supporting up to 15 devices, this is a very flexible bit of kit, and its 2,930mAh battery is capable of 11 hours of operating time and 260 hours on standby.

An attractive touchscreen gives you all the information you need, and allows you to manage connections and change settings on the fly. It's put together with a robust build quality we've come to expect from Netgear, and in our view this is one of the best Mi-Fi portable hotspots money can buy right now.

The Mobile Wi-Fi Pro from Huawei, otherwise known as the E5770, ticks a lot of boxes for power users. This 4G/LTE model (Cat-4, so only 150Mbps) has one of the biggest batteries we’ve seen on any Mi-Fi device at 5,200mAh. It can even charge another device thanks to a bundled cable that doubles as a stylish strap. Up to 10 devices can be connected with a quoted working time of up to 20 hours.

If that wasn’t good enough, it’s also the only hotspot that we’re aware of that comes with a microSD card slot (sadly taking FAT-formatted cards only) and an Ethernet port. That makes it perfect for small businesses and even, dare we say, a perfect cord-cutting device if paired with the right SIM card.

This is the antithesis of your traditional pocket-sized hotspot and we’re bending the rules to include it in this article. Behold the Netgear Nighthawk R7100LG, a router with a SIM card slot. Technically, it is not portable as the device requires a mains power supply, but there are potential workarounds if you really want to make this happen.

The Nighthawk is a great solution should you want to offer internet access to a massive amount of users, and indeed storage access as well. It offers Cat 6/LTE (300Mbps), AC1900 Wi-Fi, two USB ports, a free app to manage the router (Genie), four Gigabit Ethernet ports plus open source support and a wealth of security features.

TechRadar's downloads advent calendar: get Steganos Safe 19 free

The holidays are an expensive time, so we’re bringing you a special treat: a full, free Windows program to download every day until Christmas.

The 13th program in our free downloads advent calendar is Steganos Safe 19 – a secure digital vault to protect your most important and personal files.

Steganos Safe is designed for the data you don't want anyone else to see, whether it's stored on your PC, an external drive, a USB stick, or in the cloud.

This incredibly handy program can create safes up to 2TB in size and protect them with 384-bit AES-XEX encryption. You can lock them with alphanumeric passwords, PicPass, or a USB key to keep out snoopers, and hide them in plain sight as ordinary files.

Once you've unlocked your safe, it integrates seamlessly with Windows and behaves just like any other drive until you re-lock it. 

Steganos Safe is incredibly easy to use. Download it, request your free serial number and start protecting your files today.

In case you missed it...

Amazon's last day for Christmas delivery revealed: these are the best deals to get now

As Christmas is fast approaching (less than two weeks away!) so are holiday shipping deadlines for online retailers. To ensure your gifts will get there on time and to avoid last-minute shipping fees, Amazon has just released its Holiday Delivery Calendar. Below are Amazon's cutoff dates for free Christmas delivery.

  • December 18: Last day for standard shipping
  • December 22: Last day for Amazon Prime free two-day shipping
  • December 23: Last day for Amazon Prime free one-day shipping
  • December 24: Last day for Amazon Prime free same-day delivery (select cities)

Now that you know Amazon's cutoff dates for Christmas delivery, it's time to start shopping for gifts. While last-minute shopping can be stressful, we've rounded up a wide selection of gifts that will meet the shipping deadlines listed above. From pressure cookers to tablets, we've found a variety of top-selling gifts for anyone on your list.

Shop our top gifts below and make sure you check back to see new items added daily.

The best powerline adapters 2018: top picks for expanding your home network

In our list of the best powerline adapters of 2018 you'll find the best devices for plugging into your home's power supply to turn it into a high-speed network.

The best powerline adapters are useful tools for making sure that every device in your home has access to the internet, especially if they rely on a wired internet connection. Rather than trailing Ethernet cables throughout your home or office, you can use powerline adapters to spread the network throughout your building.

The best powerline adapters can also eliminate Wi-Fi blackspots, as many of them have built-in Wi-Fi that can help extend your wireless network.

The best powerline adapters are also incredibly easy to install - just plug one into a power socket by your router or modem, and connect it via an Ethernet cable. Then, place a second adapter where you want to bring the network or internet connection, and connect any devices to that second adapter.

You can add more adapters throughout the building; their network speeds are much faster than Wi-Fi, and thick walls and floors won't affect them. You will need the power lines in your building to be in good working order, however. Some powerline adapters also include Wi-Fi antennae for bringing wireless networks into difficult-to-reach parts of a building.

There's a large range of powerline adapters available to buy, so to help you choose we've created this list of the very best.

The Devolo dLAN 1200+ WiFi ac is one of the fastest powerline adapters on the market, able to reach speeds of 1.2 gigabits a second - though you should note that you won't often get those kinds of speeds, as there are numerous factors that can affect powerline speeds.

Still, this is a very fast powerline adapter, and the fact that it can also broadcast dual-band wireless ac networks makes it very versatile. The adapter also has a pass-through power port, which means you won't lose a power socket - just plug other devices into the adapter itself.

The TP-Link AV2000 takes the award for fastest powerline adapter, with a maximum speed of 2,000Mbps - though of course actual speeds will be lower. Still, it offers fantastic speeds, along with built-in dual-band wireless ac networking and a pass-through socket.

In the starter kit you'll get two adapters, one has a single Ethernet port, which you should use to connect to your modem or router, and the second one has two Ethernet ports for connecting wired devices.

Asus may not be the first company that comes to mind when you think about networking devices, but it makes some very good products - such as the Asus 1200Mbps AV2 1200 Wi-Fi Powerline Adapter.

As the name suggests, this is a very fast powerline adapter that is also able to broadcast a Wi-Fi network. Unlike other powerline adapters with Wi-Fi, the Asus 1200Mbps AV2 1200 Wi-Fi Powerline Adapter has external antennae, which you can angle for increased coverage - though it does mean the units themselves look a little bit ugly compared to some of their competitors.

If you're looking for a nice, cheap but reliable powerline adapter, then the TP-Link AV600 is a great choice. It's a lot less money than many of the adapters on this list, but it still manages to offer plenty of features.

For example, it can broadcast Wi-Fi, and with a Wi-Fi clone button you can easily extend your existing wireless network. It also features two Ethernet ports, and the second adapter in the set includes a pass-through socket so you don't lose out on a power socket when using the adapter. While the 600Mbps top speed isn't as high as others on this list, it's still enough to transfer big files and stream media around your home.

This is an excellent entry-level powerline adapter that does a very good job of transmitting your network traffic over your power lines. It doesn't boast the highest speeds, nor does it have Wi-Fi or a pass-through socket, but it does the job well considering the price, and its low cost means it's easy to add adapters to your network in the future.

NVIDIA Jetson AGX Xavier Delivers 32 TeraOps for New Era of AI in Robotics

The world’s ultimate embedded solution for AI developers, Jetson AGX Xavier, is now shipping as standalone production modules from NVIDIA. A member of NVIDIA’s AGX Systems for autonomous machines, Jetson AGX Xavier is ideal for deploying advanced AI and computer vision to the edge, enabling robotic platforms in the field with workstation-level performance and the ability to operate fully autonomously without relying on human intervention or cloud connectivity. Intelligent machines powered by Jetson AGX Xavier have the freedom to interact and navigate safely in their environments, unencumbered by complex terrain and dynamic obstacles, accomplishing real-world tasks with complete autonomy. These include package delivery and industrial inspection tasks that require advanced levels of real-time perception and inferencing. As the world’s first computer designed specifically for robotics and edge computing, Jetson AGX Xavier’s high performance can handle visual odometry, sensor fusion, localization and mapping, obstacle detection, and the path planning algorithms critical to next-generation robots. Figure 1 shows the production compute modules now available globally. Developers can now begin deploying new autonomous machines in volume.

Figure 1. Jetson AGX Xavier embedded compute module with Thermal Transfer Plate (TTP), 100x87mm

The latest generation of NVIDIA’s industry-leading Jetson AGX family of embedded Linux high-performance computers, Jetson AGX Xavier delivers GPU workstation class performance with an unparalleled 32 TeraOPS (TOPS) of peak compute and 750Gbps of high-speed I/O in a compact 100x87mm form-factor. Users can configure operating modes at 10W, 15W, and 30W as needed for their applications. Jetson AGX Xavier sets a new bar for compute density, energy efficiency, and AI inferencing capabilities deployable to the edge, enabling next-level intelligent machines with end-to-end autonomous capabilities.

Jetson powers the AI behind many of the world’s most advanced robots and autonomous machines using deep learning and computer vision while focusing on performance, efficiency, and programmability. Jetson AGX Xavier, diagrammed in figure 2, is based on the most complex System-on-Chip (SoC) ever created, consisting of over 9 billion transistors. The platform comprises an integrated 512-core NVIDIA Volta GPU including 64 Tensor Cores, an 8-core NVIDIA Carmel ARMv8.2 64-bit CPU, 16GB of 256-bit LPDDR4x, dual NVIDIA Deep Learning Accelerator (DLA) engines, an NVIDIA Vision Accelerator engine, HD video codecs, 128Gbps of dedicated camera ingest, and 16 lanes of PCIe Gen 4 expansion. Memory bandwidth over the 256-bit interface weighs in at 137GB/s, while the DLA engines offload inferencing of Deep Neural Networks (DNNs). NVIDIA’s JetPack SDK 4.1.1 for Jetson AGX Xavier includes CUDA 10.0, cuDNN 7.3, and TensorRT 5.0, providing the complete AI software stack.

Figure 2. Jetson AGX Xavier offers a rich set of high-speed I/O

This gives developers the ability to deploy accelerated AI in applications like robotics, intelligent video analytics, medical instruments, embedded IoT edge devices, and more. Like its predecessors Jetson TX1 and TX2, Jetson AGX Xavier uses a System-on-Module (SoM) paradigm: all the processing is contained on the compute module, while high-speed I/O is provided on a breakout carrier or enclosure through a high-density board-to-board connector. Encapsulating functionality on the module in this way makes it easy for developers to integrate Jetson AGX Xavier into their own designs. NVIDIA has released comprehensive documentation and reference design files, available to download, for embedded designers creating their own devices and platforms using Jetson AGX Xavier. Be sure to consult the Jetson AGX Xavier Module Data Sheet and Jetson AGX Xavier OEM Product Design Guide for the full set of product features listed in Table 1, in addition to electromechanical specifications, the module pin-out, power sequencing, and signal routing guidelines.

 

Table 1: Jetson AGX Xavier System-on-Module features and capabilities

  • CPU: 8-core NVIDIA Carmel 64-bit ARMv8.2 @ 2265MHz
  • GPU: 512-core NVIDIA Volta @ 1377MHz with 64 Tensor Cores
  • DL: Dual NVIDIA Deep Learning Accelerators (DLAs)
  • Memory: 16GB 256-bit LPDDR4x @ 2133MHz | 137GB/s
  • Storage: 32GB eMMC 5.1
  • Vision: (2x) 7-way VLIW Vision Accelerator
  • Encoder*: (4x) 4Kp60 | (8x) 4Kp30 | (16x) 1080p60 | (32x) 1080p30, maximum throughput up to (2x) 1000MP/s – H.265 Main
  • Decoder*: (2x) 8Kp30 | (6x) 4Kp60 | (12x) 4Kp30 | (26x) 1080p60 | (52x) 1080p30, maximum throughput up to (2x) 1500MP/s – H.265 Main
  • Camera†: (16x) MIPI CSI-2 lanes, (8x) SLVS-EC lanes; up to 6 active sensor streams and 36 virtual channels
  • Display: (3x) eDP 1.4 / DP 1.2 / HDMI 2.0 @ 4Kp60
  • Ethernet: 10/100/1000 BASE-T Ethernet + MAC + RGMII interface
  • USB: (3x) USB 3.1 + (4x) USB 2.0
  • PCIe††: (5x) PCIe Gen 4 controllers | 1×8, 1×4, 1×2, 2×1
  • CAN: Dual CAN bus controller
  • Misc I/Os: UART, SPI, I2C, I2S, GPIOs
  • Socket: 699-pin board-to-board connector, 100x87mm with 16mm Z-height
  • Thermals‡: -25°C to 80°C
  • Power: 10W / 15W / 30W profiles, 9.0V-20VDC input

*Maximum number of concurrent streams up to the aggregate throughput. Supported video codecs: H.265, H.264, VP9. Please refer to the Jetson AGX Xavier Module Data Sheet §1.6.1 and §1.6.2 for specific codec and profile specifications.
†MIPI CSI-2, up to 40 Gbps in D-PHY v1.2 or 109 Gbps in C-PHY v1.1; SLVS-EC, up to 18.4 Gbps
††(3x) Root Port + Endpoint controllers and (2x) Root Port controllers
‡Operating temperature range, Thermal Transfer Plate (TTP) max junction temperature.

 

Jetson AGX Xavier includes more than 750Gbps of high-speed I/O, providing an extraordinary amount of bandwidth for streaming sensors and high-speed peripherals.  It’s one of the first embedded devices to support PCIe Gen 4, providing 16 lanes across five PCIe Gen 4 controllers, three of which can operate in root port or endpoint mode.  16 MIPI CSI-2 lanes can be connected to four 4-lane cameras, six 2-lane cameras, six 1-lane cameras, or a combination of these configurations up to six cameras, with 36 virtual channels allowing more cameras to be connected simultaneously using stream aggregation.  Other high-speed I/O includes three USB 3.1 ports, SLVS-EC, UFS, and RGMII for Gigabit Ethernet. Developers now have access to NVIDIA’s JetPack 4.1.1 Developer Preview software for Jetson AGX Xavier, listed in table 2. The Developer Preview includes Linux For Tegra (L4T) R31.1 Board Support Package (BSP) with support for Linux kernel 4.9 and Ubuntu 18.04 on the target. On the host PC side, JetPack 4.1.1 supports Ubuntu 16.04 and Ubuntu 18.04.

 

Table 2: Software components included in JetPack 4.1.1 Developer Preview and L4T BSP for Jetson AGX Xavier

  • Linux For Tegra R31.0.1 (K4.9) with Ubuntu 18.04 LTS aarch64
  • CUDA Toolkit 10.0
  • cuDNN 7.3
  • TensorRT 5.0 GA
  • GStreamer 1.14.1
  • VisionWorks 1.6
  • OpenCV 3.3.1
  • OpenGL 4.6 / GLES 3.2
  • Vulkan 1.1
  • NVIDIA Nsight Systems 2018
  • NVIDIA Nsight Graphics 1.0
  • Multimedia API R31.1
  • Argus 0.97 Camera API

 

The JetPack 4.1.1 Developer Preview release allows developers to immediately begin prototyping products and applications with Jetson AGX Xavier in preparation for production deployment.  NVIDIA will continue making improvements to JetPack with additional feature enhancements and performance optimizations. Please read the Release Notes for highlights and software status of this release.

Volta GPU

The Jetson AGX Xavier integrated Volta GPU, shown in figure 3, provides 512 CUDA cores and 64 Tensor Cores for up to 11 TFLOPS FP16 or 22 TOPS of INT8 compute, with a maximum clock frequency of 1.37GHz. It supports CUDA 10 with a compute capability of sm_72. The GPU includes eight Volta Streaming Multiprocessors (SMs), with 64 CUDA cores and 8 Tensor Cores per SM. Each Volta SM includes a 128KB L1 cache, 8x larger than previous generations, and the SMs share a 512KB L2 cache that offers 4x faster access than previous generations.

Figure 3. Jetson AGX Xavier Volta GPU block diagram

Each SM consists of four separate processing blocks referred to as SMPs (streaming multiprocessor partitions), each including its own L0 instruction cache, warp scheduler, dispatch unit, and register file, along with CUDA cores and Tensor Cores. With twice as many SMPs per SM as Pascal, the Volta SM features improved concurrency and supports more threads, warps, and thread blocks in flight.
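
If you want to sanity-check those figures on your own module, a few lines of CUDA are enough. The sketch below is ours rather than NVIDIA's: it queries device 0 through the CUDA runtime and prints the compute capability, SM count, and maximum clock, which on Jetson AGX Xavier should come back as sm_72 with eight SMs at up to 1.37GHz.

#include <cstdio>
#include <cuda_runtime.h>

// Minimal sanity check: query device 0 (the integrated Volta GPU on Jetson AGX Xavier)
// and print the properties discussed above. Expect sm_72 with 8 multiprocessors.
int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        std::fprintf(stderr, "No CUDA device found\n");
        return 1;
    }
    std::printf("%s: sm_%d%d, %d SMs, %.2f GHz max clock, %zu MB memory\n",
                prop.name, prop.major, prop.minor, prop.multiProcessorCount,
                prop.clockRate / 1.0e6,           // clockRate is reported in kHz
                prop.totalGlobalMem >> 20);
    return 0;
}

Build it on the device with JetPack's CUDA 10.0 toolchain, for example with nvcc query.cu -o query.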

Tensor Cores

NVIDIA Tensor Cores are programmable fused matrix-multiply-and-accumulate units that execute concurrently alongside CUDA cores. Tensor Cores implement new floating-point HMMA (Half-Precision Matrix Multiply and Accumulate) and IMMA (Integer Matrix Multiply and Accumulate) instructions for accelerating dense linear algebra computations, signal processing, and deep learning inference.

Figure 4. Tensor Core HMMA/IMMA 4x4x4 matrix multiply and accumulate

The matrix multiply inputs A and B are FP16 matrices for HMMA instructions, while the accumulation matrices C and D may be FP16 or FP32 matrices. For IMMA, the matrix multiply input A is a signed or unsigned INT8 or INT16 matrix, B is a signed or unsigned INT8 matrix, and both C and D accumulator matrices are signed INT32. Hence the range of precision and computation is sufficient to avoid overflow and underflow conditions during internal accumulation.

NVIDIA libraries including cuBLAS, cuDNN, and TensorRT have been updated to utilize HMMA and IMMA internally, allowing programmers to easily take advantage of the performance gains inherent in Tensor Cores. Users can also directly access Tensor Core operations at the warp level via a new API exposed in the wmma namespace and mma.h header included in CUDA 10.  The warp-level interface maps 16×16, 32×8, and 8×32 size matrices across all 32 threads per warp.
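
To make that warp-level interface concrete, here is a minimal, hedged CUDA sketch of a single-tile HMMA operation using the wmma namespace and mma.h header from CUDA 10. The kernel name and pointer arguments are our own illustrative choices, and error handling is omitted; a real GEMM would tile many such fragments per thread block.

#include <mma.h>
#include <cuda_fp16.h>
using namespace nvcuda;

// One warp computes a single 16x16 output tile D = A*B + C on the Tensor Cores,
// with FP16 inputs (A, B) and FP32 accumulation (C, D), matching the HMMA path above.
__global__ void wmma_tile_kernel(const half* A, const half* B, const float* C, float* D) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc_frag;

    wmma::load_matrix_sync(a_frag, A, 16);                        // leading dimension = 16
    wmma::load_matrix_sync(b_frag, B, 16);
    wmma::load_matrix_sync(acc_frag, C, 16, wmma::mem_row_major);

    wmma::mma_sync(acc_frag, a_frag, b_frag, acc_frag);           // Tensor Core multiply-accumulate

    wmma::store_matrix_sync(D, acc_frag, 16, wmma::mem_row_major);
}

// Launch with exactly one warp for this single-tile example:
// wmma_tile_kernel<<<1, 32>>>(dA, dB, dC, dD);

Compiling with nvcc -arch=sm_72 targets the integrated Volta GPU directly; the same code also builds for discrete Volta parts with -arch=sm_70.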

Deep Learning Accelerator

Jetson AGX Xavier features two NVIDIA Deep Learning Accelerator (DLA) engines, shown in figure 5, that offload the inferencing of fixed-function Convolutional Neural Networks (CNNs). These engines improve energy efficiency and free up the GPU to run more complex networks and dynamic tasks implemented by the user. The NVIDIA DLA hardware architecture is open-source and available at NVDLA.org. Each DLA has up to 5 TOPS INT8 or 2.5 TFLOPS FP16 performance with a power consumption of only 0.5-1.5W. The DLAs support accelerating CNN layers such as convolution, deconvolution, activation functions, min/max/mean pooling, local response normalization, and fully-connected layers.

Figure 5. Block diagram of Deep Learning Accelerator (DLA) architecture

DLA hardware consists of the following components:

  • Convolution Core – optimized high-performance convolution engine.
  • Single Data Processor – single-point lookup engine for activation functions.
  • Planar Data Processor – planar averaging engine for pooling.
  • Channel Data Processor – multi-channel averaging engine for advanced normalization functions.
  • Dedicated Memory and Data Reshape Engines – memory-to-memory transformation acceleration for tensor reshape and copy operations.

Developers program the DLA engines using TensorRT 5.0 to perform inferencing on networks, including support for AlexNet, GoogleNet, and ResNet-50. For networks that use layer configurations not supported by DLA, TensorRT provides GPU fallback for the layers that cannot run on the DLAs. The JetPack 4.0 Developer Preview initially limits DLA precision to FP16 mode, with INT8 precision and increased DLA performance coming in a future JetPack release.

TensorRT 5.0 adds the following APIs to its IBuilder interface to enable the DLAs:

  • setDeviceType() and setDefaultDeviceType() for selecting GPU, DLA_0, or DLA_1 for the execution of a particular layer, or for all layers in the network by default.
  • canRunOnDLA() to check if a layer can run on DLA as configured.
  • getMaxDLABatchSize() for retrieving the maximum batch size that DLA can support.
  • allowGPUFallback() to enable the GPU to execute layers that DLA does not support.

Please refer to Chapter 6 of the TensorRT 5.0 Developer Guide for the full list of supported layer configurations and code examples of working with DLA in TensorRT.
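
As a rough illustration of how those calls fit together, here is a hedged C++ sketch that defaults a network to one DLA, pins any layer the builder reports as unsupported back to the GPU, and enables FP16 to match the DLA precision available in this release. It is a sketch under assumptions rather than code from the developer guide: the DeviceType enumerator spellings (written here as kDLA0 and kGPU) and the exact method signatures should be checked against the NvInfer.h header shipped with your TensorRT 5.0 release.

#include "NvInfer.h"
using namespace nvinfer1;

// Hedged sketch: route supported layers to DLA_0 and fall back to the GPU elsewhere.
// The IBuilder methods are the ones listed above; the DeviceType enumerator names
// (kDLA0, kGPU) are assumptions -- check NvInfer.h for the exact spelling.
void configureForDLA(IBuilder* builder, INetworkDefinition* network) {
    builder->allowGPUFallback(true);                    // let unsupported layers run on the GPU
    builder->setDefaultDeviceType(DeviceType::kDLA0);   // run everything else on DLA_0
    builder->setFp16Mode(true);                         // DLA is limited to FP16 in this release

    for (int i = 0; i < network->getNbLayers(); ++i) {
        ILayer* layer = network->getLayer(i);
        if (!builder->canRunOnDLA(layer)) {
            // Explicitly pin unsupported layers to the GPU
            // (allowGPUFallback would also handle this automatically).
            builder->setDeviceType(layer, DeviceType::kGPU);
        }
    }

    // Keep the batch size within what the DLA engines can handle.
    builder->setMaxBatchSize(builder->getMaxDLABatchSize());
}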

Deep Learning Inferencing Benchmarks

We’ve released deep learning inferencing benchmark results for Jetson AGX Xavier on common DNNs such as variants of ResNet, GoogleNet, and VGG. We ran these benchmarks using the JetPack 4.1.1 Developer Preview release with TensorRT 5.0 on Jetson AGX Xavier’s GPU and DLA engines. The GPU and two DLAs ran the same network architectures concurrently, in INT8 and FP16 precision respectively, with the aggregate performance reported for each configuration. In real-world use cases, the GPU and DLAs can run different networks or models concurrently, serving unique functions alongside each other in parallel or in a processing pipeline. Using INT8 versus full FP32 precision in TensorRT results in an accuracy loss of 1% or less.

First, let’s consider the results from ResNet-18 FCN (Fully Convolutional Network), which is a full-HD model with 2048×1024 resolution used for semantic segmentation.  Segmentation provides per-pixel classification for tasks like freespace detection and occupancy mapping, and is representative of deep learning workloads computed by autonomous machines for perception, path planning, and navigation.  Figure 6 shows the measured throughput of running ResNet-18 FCN on Jetson AGX Xavier versus Jetson TX2.

Figure 6. ResNet-18 FCN inferencing throughput of Jetson AGX Xavier and Jetson TX2

Jetson AGX Xavier currently achieves up to 13x the ResNet-18 FCN inference performance of Jetson TX2. NVIDIA will continue releasing software optimizations and feature enhancements in JetPack that will further improve performance and power characteristics over time. Note that the full benchmark listings report the performance of ResNet-18 FCN for Jetson AGX Xavier up to batch size 32; however, in figure 7 we only plot up to a batch size of 16, as that is the largest batch size at which Jetson TX2 can run ResNet-18 FCN.

Figure 7. ResNet-18 FCN inferencing energy efficiency of Jetson AGX Xavier and Jetson TX2

When considering energy efficiency, measured in images processed per second per watt, Jetson AGX Xavier is currently up to 6x more power efficient than Jetson TX2 on ResNet-18 FCN. We calculated efficiency by measuring the total module power consumption using the onboard INA voltage and current monitors, including the energy usage of the CPU, GPU, DLAs, memory, miscellaneous SoC power, I/O, and regulator efficiency losses on all rails. Both Jetsons were run in 15W power mode. Jetson AGX Xavier and JetPack ship with configurable preset power profiles for 10W, 15W, and 30W, switchable at runtime using the nvpmodel power management tool. Users can also define their own customized profiles with different clocks and DVFS (Dynamic Voltage and Frequency Scaling) governor settings tailored to achieve the best performance for individual applications.

Next, let’s compare Jetson AGX Xavier benchmarks against Jetson TX2 on the image recognition networks ResNet-50 and VGG19 across batch sizes 1 through 128. These models classify image patches at 224×224 resolution, and are frequently used as the encoder backbone in various object detection networks. A batch size of 8 or higher at this lower resolution can be used to approximate the performance and latency of a batch size of 1 at higher resolutions. Robotic platforms and autonomous machines often incorporate multiple cameras and sensors whose data can be batch processed for increased performance, in addition to performing detection of regions-of-interest (ROIs) followed by further classification of the ROIs in batches. Figure 8 also includes estimates of future performance for Jetson AGX Xavier, incorporating software enhancements such as INT8 support for DLA and additional GPU optimizations.

Figure 8. Estimated performance after INT8 support for DLA and additional GPU optimizations

Jetson AGX Xavier currently achieves up to 18x the throughput of Jetson TX2 on VGG19 and 14x on ResNet-50 measured running JetPack 4.1.1, as shown in figure 9. The latency of ResNet-50 is as low as 1.5ms, or over 650FPS, with a batch size of 1. Jetson AGX Xavier is estimated to be up to 24x faster than Jetson TX2 with future software improvements. Note that for legacy comparisons we also provide data for GoogleNet and AlexNet in the full performance listings.

Figure 9. ResNet-50 and VGG19 energy efficiency for Jetson Xavier and Jetson TX2

Jetson AGX Xavier is currently more than 7x more efficient at VGG19 inference than Jetson TX2 and 5x more efficient with ResNet-50, with up to a 10x increase in efficiency when considering future software optimizations and enhancements. Consult the full performance results for additional data and details about the inferencing benchmarks. We also benchmark the CPU performance in the next section.

Carmel CPU Complex

Jetson AGX Xavier’s CPU complex, shown in figure 10, consists of four heterogeneous dual-core NVIDIA Carmel CPU clusters based on ARMv8.2 with a maximum clock frequency of 2.26GHz. Each core includes 128KB instruction and 64KB data L1 caches plus a 2MB L2 cache shared between the two cores. The CPU clusters share a 4MB L3 cache.

Figure 10. Block diagram of Jetson Xavier CPU complex with NVIDIA Carmel clusters

The Carmel CPU cores feature NVIDIA’s Dynamic Code Optimization, a 10-way superscalar architecture, and a full implementation of ARMv8.2 including full Advanced SIMD, VFP (Vector Floating Point), and ARMv8.2-FP16.

The SPECint_rate benchmark measures CPU throughput for multi-core systems. The overall performance score averages several intensive sub-tests, including compression, vector and graph operations, code compilation, and executing AI for games like chess and Go. Figure 11 shows benchmark results with a greater than 2.5x increase in CPU performance between generations.

Figure 11. CPU performance of Jetson AGX Xavier vs. Jetson TX2 in SPECInt2K_rate 8x* benchmark  *The Jetson AGX Xavier / Jetson TX2 SPECint benchmarks have not been officially submitted to SPEC, and are considered estimates as of publication.

Eight simultaneous copies of the SPECint_rate tests ran, keeping the CPUs fully loaded. Jetson AGX Xavier naturally has eight CPU cores, while Jetson TX2’s architecture uses four ARM Cortex-A57 cores and two NVIDIA Denver D15 cores; on Jetson TX2, running two copies per Denver core results in higher performance.

Vision Accelerator

Jetson AGX Xavier features two Vision Accelerator engines, shown in figure 12. Each includes a dual 7-way VLIW (Very Long Instruction Word) vector processor that offloads computer vision algorithms such as feature detection & matching, optical flow, stereo disparity block matching, and point cloud processing with low latency and low power. Imaging filters such as convolutions, morphological operators, histogramming, colorspace conversion, and warping are also ideal for acceleration.

Figure 12. Block diagram of Jetson AGX Xavier VLIW Vision Accelerator architecture

Each Vision Accelerator includes a Cortex-R5 core for command and control, two vector processing units (each with 192KB of on-chip vector memory), and two DMA units for data movement. The 7-way vector processing units contain slots for two vector, two scalar, and three memory operations per instruction. Support for the Vision Accelerator is not included in the Early Access software release, but will be enabled in a future version of JetPack.

NVIDIA Jetson AGX Xavier Developer Kit

The Jetson AGX Xavier Developer Kit contains everything developers need to get up and running quickly. The kit includes the Jetson AGX Xavier compute module, a reference open-source carrier board, a power supply, and the JetPack SDK, enabling users to begin developing applications right away. The Jetson AGX Xavier Developer Kit can be purchased for only $1,299.

Figure 13. Jetson AGX Xavier Developer Kit, including Jetson AGX Xavier module and reference carrier board.

With its 105 x 105mm footprint, the Jetson AGX Xavier Developer Kit is significantly smaller than the Jetson TX1 and TX2 Developer Kits while improving on their available I/O. I/O capabilities include two USB 3.1 ports (supporting DisplayPort and Power Delivery), a hybrid eSATAp + USB 3.0 port, a PCIe x16 slot (x8 electrical), sites for M.2 Key-M NVMe and M.2 Key-E WLAN mezzanines, Gigabit Ethernet, HDMI 2.0, and an 8-camera MIPI CSI connector. See Table 3 below for a full list of the I/Os available through the developer kit reference carrier board.

 

Table 3. I/O ports available on the Jetson AGX Xavier Developer Kit (developer kit I/O: Jetson AGX Xavier module interface)

  • PCIe x16: PCIe x8 Gen 4 / SLVS x8
  • RJ45: Gigabit Ethernet
  • USB-C: 2x USB 3.1 (DisplayPort optional) (Power Delivery optional)
  • Camera connector: 16x MIPI CSI-2 lanes, up to 6 active sensor streams
  • M.2 Key M: NVMe x4
  • M.2 Key E: PCIe x1 (for Wi-Fi / LTE / 5G) + USB2 + UART + I2S/PCM
  • 40-pin header: UART + SPI + CAN + I2C + I2S + DMIC + GPIOs
  • HD Audio header: High Definition Audio
  • eSATAp + USB 3.0: SATA via PCIe x1 bridge (Power + Data for 2.5” SATA) + USB 3.0
  • HDMI Type A: HDMI 2.0, eDP 1.2a, DP 1.4
  • uSD / UFS card socket: SD/UFS

 

We’ve pulled together an open-source Two Days to a Demo deep learning tutorial for Jetson AGX Xavier that guides developers through training and deploying DNN inferencing to perform image recognition, object detection, and segmentation, enabling you to rapidly begin creating your own AI applications. Two Days to a Demo uses the NVIDIA DIGITS interactive training system in the cloud or on a GPU-accelerated PC, and uses TensorRT to perform accelerated inferencing on images or live camera feeds on Jetson. The Two Days to a Demo code repository on GitHub has been updated to include support for the Xavier DLAs and GPU INT8 precision.

Intelligent Video Analytics (IVA)

AI and deep learning enable vast amounts of data to be used effectively for keeping cities safer and more convenient, in applications such as traffic management, smart parking, and streamlined checkout experiences in retail stores. NVIDIA Jetson and the NVIDIA DeepStream SDK enable distributed smart cameras to perform intelligent video analytics at the edge in real time, reducing the massive bandwidth loads placed on transmission infrastructure and improving security along with anonymity.

Video capture of IVA demo running on Jetson AGX Xavier with 30 concurrent HD streams

Jetson TX2 could process two HD streams concurrently with object detection and tracking. As shown in the video above, Jetson AGX Xavier is able to handle 30 independent HD video streams simultaneously at 1080p30 — a 15x improvement. Jetson AGX Xavier offers a total throughput of over 1850MP/s, enabling it to decode, pre-process, perform inferencing with ResNet-based detection, and visualize each frame in just over 1 millisecond. The capabilities of Jetson AGX Xavier bring greatly increased levels of performance and scalability to edge video analytics.

A New Era of Autonomy

Jetson AGX Xavier delivers unprecedented levels of performance onboard robots and intelligent machines. These systems require demanding compute capability for AI-driven perception, navigation, and manipulation in order to provide robust autonomous operation. Applications include manufacturing, industrial inspection, precision agriculture, and services in the home. Autonomous delivery robots that bring packages to end consumers and support logistics in warehouses, stores, and factories represent one class of application.

A typical processing pipeline for fully-autonomous delivery and logistics requires several stages of vision and perception tasks, shown in figure 14. Mobile delivery robots frequently have several peripheral HD cameras that provide 360° situational awareness, in addition to LIDAR and other ranging sensors that are fused in software along with inertial sensors. A forward-facing stereo driving camera is often used as well, requiring pre-processing and stereo depth mapping. NVIDIA has created Stereo DNN models with improved accuracy over traditional block-matching methods to support this.

Figure 14. Example AI processing pipeline of an autonomous delivery and logistics robot

Object detection models like SSD or Faster-RCNN and feature-based tracking typically inform obstacle avoidance of pedestrians, vehicles, and landmarks. In the case of warehouse and storefront robots, these object detection models locate items of interest like products, shelves, and barcodes. Facial recognition, pose estimation, and Automatic Speech Recognition (ASR) facilitate Human-Machine Interaction (HMI) so that the robot can coordinate and communicate effectively with humans.

High-framerate Simultaneous Localization and Mapping (SLAM) is critical to keeping the robot updated with an accurate 3D position. GPS alone lacks the precision for sub-meter positioning and is unavailable indoors. SLAM performs registration and alignment of the latest sensor data with the data the system has already accumulated in its point cloud. Sensor data is frequently noisy and requires substantial filtering to localize properly, especially on moving platforms.

The path planning stage often uses semantic segmentation networks like ResNet-18 FCN, SegNet, or DeepLab to perform free-space detection, telling the robot where it can drive unobstructed. The real world contains too many types of generic obstacles to detect and track individually, so a segmentation-based approach labels every pixel or voxel with its classification. Together with the previous stages of the pipeline, this informs the planner and control loop of the safe routes it can take.

The performance and efficiency of Jetson AGX Xavier make it possible to run all of these components onboard in real time, including the high-performance vision algorithms needed for perception, navigation, and manipulation, so these robots can function safely with full autonomy. With standalone Jetson AGX Xavier modules now shipping in production, developers can deploy these AI solutions to the next generation of autonomous machines.

Start Building the Next Wave of Autonomous Machines Today

Jetson AGX Xavier brings game-changing levels of compute to robotics and edge devices, delivering high-end workstation performance in an embedded platform optimized for size, weight, and power. The production Jetson AGX Xavier compute module is now available globally through distribution, with volume pricing of $1,099 in quantities of 1,000 units. Become an NVIDIA Registered Developer today to take advantage of the $1,299 price for the Jetson AGX Xavier Developer Kit, and to download platform documentation and the latest JetPack software for Jetson AGX Xavier. You can also connect with other developers in the community on the DevTalk forums.

For a deep-dive on the Jetson AGX Xavier architecture, view our On-Demand webinar, Jetson AGX Xavier and the New Era of Autonomous Machines.


The best Chromebooks 2018

When the first Chromebooks hit the market, no one, ourselves included, knew what to make of them. However, just a few years later, not only are there more than 25 million Chrome OS users, but the best Chromebooks continue to wow us with all-day battery life – something that Windows 10 laptops still can’t achieve.

The top Chromebooks won’t just have killer battery life, but they’re also inexpensive. This is because they don’t need to feature the latest processor tech – the best Chromebooks pack the hardware they need, and nothing more. And, they keep getting better – just look at the Google Pixel Slate, an extremely promising Chromebook-tablet hybrid.

Some users might feel wary of the top Chromebooks – of being restrained by what a web browser can do – and we get it. If you’re looking to play games or do media editing, you might want to look elsewhere. But, if your computer use boils down to word processing, email and video streaming, the best Chromebooks are going to tick all of your boxes.

Shortly after proclaiming the Chromebook Pixel as dead, Google revived it in a way nobody expected. Now, it’s the Google Pixelbook and it stands completely independent of its predecessor. That’s because, unlike the Chromebook Pixel, it can run Android apps natively, on top of building upon Chrome OS. And, when you add in the huge amount of storage space, fantastic stylus and Google Assistant, it shouldn’t surprise you when we say the Pixelbook is the best Chromebook 2018 has to offer – even so long after its launch.

Read the full review: Google Pixelbook

Before the Google Pixelbook showed up and showed us exactly what the best Chromebooks are capable of, the Asus Chromebook Flip was the Google laptop to get. Rocking a full-fat Intel Core processor and a full-HD display, the Chromebook Flip changed everything. With this Chromebook, all the features we take for granted came to life. Put simply, if you want the key features that the Pixelbook offers but you don’t want to drop that much cash, the Asus Chromebook Flip is a fantastic option. 

Read the full review: Asus Chromebook Flip 

When Android apps started heading to the best Chromebooks, it was only a matter of time before Samsung took its mastery of the two OSs and crafted something truly beautiful. With a 12.3 inch QHD touchscreen and a 360-degree hinge, the Samsung Chromebook Pro is widely acclaimed for its built-in stylus – the first of its kind to show up in a Chromebook. Not only does it show up a majority of laptops in its own category, but it’s better than most Android devices as well, even if the keyboard could use some improvement.  

Read the full review: Samsung Chromebook Pro

  • This product is only available in the US as of this writing. UK and Australian readers: check out a fine alternative in the Asus Chromebook Flip.

The best Chromebooks are kind of synonymous with education in 2018. And, with the Acer Chromebook Spin 13, Acer wants them to be ubiquitous in the business sector, as well. Beautifully built from aluminum with a gorgeous QHD screen, it will not only fit into any office, but it might actually draw some envious glances. It’s more than just a pretty chassis, though – the Acer Chromebook Spin 13 is backed by full-fat Ultrabook processors, so it can get work done, and look good while doing it.

Read the full review: Acer Chromebook Spin 13

If the Samsung Chromebook Pro is all about versatility, the Dell Chromebook 11 is about value. Reinforced by a 180-degree hinge, sturdy design and a sealed keyboard and trackpad in addition to a punchy typing experience, this Chromebook is a perfectly portable package. Not only adequately suited for school and work, the Dell Chromebook 11 even packs a set of loud stereo speakers for listening to music or watching videos. Don’t worry about dinging it, either, as this device remains the most rugged Chromebook on our list.

Read the full review: Dell Chromebook 11

One of the most compelling use cases for the best Chromebooks is that of the student laptop – and the Acer Chromebook Spin 11 is a perfect example. If you’re a student, or even a parent of a student that’s looking for a cheap, capable and, more importantly, durable machine to get some homework done on the go, you shouldn’t need to look further than the Acer Chromebook Spin 11. You won’t be able to do any hardcore gaming or video editing on this thing, but if you just need something to write some papers and watch some YouTube in your downtime – you should give it a look.

Read the full review: Acer Chromebook Spin 11 

With its pristine build quality that rivals even a MacBook, it’s easy to look past the Acer Chromebook 15’s aversion to 2-in-1 form factors. However, given that most Chromebooks released this year are fully convertible, thanks to the wide adoption of Android app support, the Acer Chromebook 15 had to prove itself to us with more than just a nice aesthetic. So, beyond its ability to lay flat using a 180-degree hinge, this 15-inch beauty makes great strides with its battery life as well, lasting almost 17 hours in our own TechRadar battery life test. 

Read the full review: Acer Chromebook 15

We wouldn’t be surprised to see Acer replace the Chromebook R11 – at least, judging by the recent release of the Chromebook Spin 11 – but it still holds up to this day as one of the best Chromebooks on the market. It isn’t the most powerful option out there, but it still gives you full access to Android apps on the Google Play store. What’s more, it does so on a touchscreen display that can be rotated around into tablet mode, complemented by an all-metal finish that you won’t be ashamed of. 

Read the full review: Acer Chromebook R11

In our mind, the best Chromebooks are the ones that balance a rock-bottom price and speedy use of Chrome OS – and the HP Chromebook 14 is a perfect example. While it’s similar to the Acer Chromebook 15 in a lot of ways, this 14-inch Chromebook is a bit more compact and even looks better. Complemented by a bright blue finish and a screen made to astonish, the HP Chromebook 14 boasts the best value of any Chromebook out there. Even if the battery life and performance are average – the HP Chromebook 14 is easily one of the best Chromebooks on the market right now.

Read the full review: HP Chromebook 14

The HP Chromebook 13 is way better than anyone would expect from a Chromebook. You’re getting a 1440p display, two USB-C ports and, if you’re willing to shell out a bit more cash, you can get yourself an Intel Core M processor rather than a Pentium. All of this is complemented by incredible style and a metallic design that exudes Pixel influence. It might not be as powerful as the Google Pixelbook, but it’s still one of the best Chromebooks when it comes to sheer style. 

Read the full review: HP Chromebook 13

Juan Martinez and Gabe Carey have also contributed to this article.

Future all-Apple iPhone? Apple may make the modem in-house eventually

Future iPhone components may be more Apple, and less partners and rivals, with a new report indicating the company will invest in making cellular modems in-house.

The latest tip comes from Apple itself, which states in a job ad that it's looking for 'a cellular modem systems architect to work in its San Diego office', as first reported by The Information. That's the same city as semiconductor firm Qualcomm, which has made previous iPhone and iPad modems. 

An Apple-made modem in iOS devices would give it more direct oversight into the internal specs of its hardware. Plus, it would cut out a long-time supplier who has recently become a courtroom foe.

Sorry, Qualcomm and Intel

Apple has taken increasing control of the components that go into its hardware. Notably, it decided to design the iPhone and iPad graphics chip in-house last year, which subsequently sank PowerVR GPU maker Imagination Technologies.

The Cupertino company is also reportedly planning to start making its own ARM-based CPUs for Macs in 2020, which will replace current Intel-made chips. Apple hasn't publicly confirmed this.

Now Apple seems to be internalizing cellular modem design duties at the expense of Intel and Qualcomm, and the two semiconductor firms may be the reason it's going in-house.

Apple is in a legal dispute with Qualcomm, which supplied the majority of the modems in the iPhone and iPad from the iPhone X on back. In 2018, Apple started using Intel modems exclusively in the iPhone XS and iPhone XS Max, though their speed and performance haven't matched Qualcomm's modems in our experience.

Qualcomm makes the most widely-used Android phone chips, and it just unveiled a new one: the forthcoming Snapdragon 855 chipset, which has a 5G-ready cellular modem embedded on it. 

Apple may outfit its future A-series chipsets with its own modems, but today's report hints that Apple cellular modems may take another three years to arrive. That all-Apple iPhone with nothing but Cupertino-designed internals may end up being a few iPhone cycles away. 
