
  • 6 Basic Security Concerns for SQL Databases

    Consider these scenarios: A low-level IT systems engineer spills soda, which takes down a bank of servers; a warehouse fire burns all of the patient records of a well-regarded medical firm; a government division’s entire website vanishes without a trace. Data breaches and failures are not isolated incidents. According to the 2014 Verizon Data Breach Investigations Report, databases are one of the most critical vulnerability points in corporate data assets. Databases are targeted because their information is so valuable, and many organizations are not taking the proper steps to ensure data protection.

    • Only 5 percent of billions of dollars allocated to security products is used for security in data centers, according to a report from International Data Corporation (IDC).
    • In a July 2011 survey of employees at organizations with multiple computers connected to the Internet, almost half said they had lost or deleted data by accident.
    • According to Fortune magazine, corporate CEOs are not making data security a priority, seemingly deciding that they will handle a data problem if it actually happens.

    You might think CEOs would be more concerned, even if it is just for their own survival. A 2013 data breach at Target was widely considered an important contributing factor in the ouster of Gregg Steinhafel, then company president, CEO and chairman of the board. The Target breach affected more than 40 million debit and credit card accounts at the retailing giant. Stolen data included customers’ names, their associated card numbers, security codes and expiration dates.
    Although the threats to corporate database security have never been more sophisticated and organized, taking the necessary steps and implementing accepted best practices will decrease the chances of a data breach or other database security crisis at your organization.

    6 Basic Security Concerns

    If you are new to database administration, you may not be familiar with the basic steps you can take to improve database security. Here are the first moves you should make:

    1. The physical environment. One of the most often overlooked steps in increasing database security is locking down the physical environment. While most security threats are, in fact, at the network level, the physical environment presents opportunities for bad actors to compromise physical devices. Unhappy employees can abscond with company records, health information or credit data. To protect the physical environment, start by implementing and maintaining strict security measures that are detailed and updated on a regular basis. Severely limit access to physical devices to a short list of employees who must have access as part of their jobs. Strive to educate employees and systems technicians about maintaining good security habits while operating company laptops, hard drives, and desktop computers. Lackadaisical security habits make employees an easy target.


    2. Network security. Database administrators should assess weak points in their networks and in how company databases connect to them. Updated antivirus software running on the network is a fundamental requirement. Also, ensure that secure firewalls are implemented on every server. Consider changing TCP/IP ports from the defaults, as the standard ports are well-known access points for hackers and Trojan horses.
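    As a quick check on the default-port concern, here is a minimal sketch in Python (standard library only; the host name and the port list are illustrative assumptions) that probes a host for well-known database ports so you can verify what is exposed:

        import socket

        # Well-known default ports for common database products (illustrative list)
        DEFAULT_DB_PORTS = {
            1433: "Microsoft SQL Server",
            3306: "MySQL",
            5432: "PostgreSQL",
            1521: "Oracle",
            27017: "MongoDB",
        }

        def audit_default_ports(host: str, timeout: float = 1.0) -> None:
            """Report which well-known database ports accept TCP connections."""
            for port, product in DEFAULT_DB_PORTS.items():
                with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
                    sock.settimeout(timeout)
                    # connect_ex returns 0 when the port accepts a connection
                    if sock.connect_ex((host, port)) == 0:
                        print(f"{host}:{port} is open (default port for {product})")

        audit_default_ports("db.example.internal")  # hypothetical host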


    3. Server environment. Information in a database can appear in other areas, such as log files, depending on the nature of the operating system and database application. Because the data can appear in different areas of the server environment, you should check that every folder and file on the system is protected. Limit access as much as possible, allowing only the people who absolutely need permission to get that information. This applies to the physical machine as well. Do not provide users with elevated access when they only need lower-level permissions.
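    To make the file-and-folder advice concrete, here is a minimal sketch (Python standard library; the data directory path is hypothetical) that flags world-readable files under a database’s data directory:

        import os
        import stat

        def find_world_readable(root: str):
            """Yield files under root that any user on the system can read."""
            for dirpath, _dirnames, filenames in os.walk(root):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    mode = os.stat(path).st_mode
                    if mode & stat.S_IROTH:  # "other" read bit is set
                        yield path

        for path in find_world_readable("/var/lib/exampledb"):  # hypothetical path
            print("world-readable:", path)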


    4. Avoid over-deployment of features. Modern databases and related software have some services designed to make the database faster, more efficient and secure. At the same time, software application companies are in a very competitive field, essentially a mini arms race to provide better functionality every year. The result is that you may have deployed more services and features than you will realistically use. Review each feature that you have in place, and turn off any service that is not really needed. Doing so cuts down the number of areas or “fronts” where hackers can attack your database.


    5. Patch the system. Just like a personal computer operating system, databases must be updated on a continuing basis. Vendors constantly release patches, service packs and security updates, but these are only good if you implement them right away. Here is a cautionary tale: In 2003, a computer worm called SQL Slammer penetrated tens of thousands of servers within minutes of its release. The worm exploited a buffer overflow vulnerability in Microsoft SQL Server and the SQL Server Desktop Engine. A patch that fixed the weakness had been released the previous summer, but many of the companies that became infected had never patched their servers.


    6. Encrypt sensitive data. Although back-end databases might seem to be more secure than components that interface with end users, the data must still be accessed through the network, which increases its risk. Encryption cannot stop malicious hackers from attempting to access data. However, it does provide another layer of security for sensitive information such as credit card numbers.
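    As one possible approach, the sketch below uses the third-party Python cryptography package (an assumption on tooling; any vetted encryption library would do) to encrypt a card number before it is written to the database:

        from cryptography.fernet import Fernet  # pip install cryptography

        # In production the key would come from a key-management service,
        # never from source code; generating it inline is for illustration only.
        key = Fernet.generate_key()
        fernet = Fernet(key)

        card_number = "4111111111111111"
        token = fernet.encrypt(card_number.encode())  # store this ciphertext
        print(token)

        # Decrypt only at the moment of use
        original = fernet.decrypt(token).decode()
        assert original == card_number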

    Famous Data Breaches

    Is all this overblown? Maybe stories of catastrophic database breaches are ghost stories, conjured up by senior IT managers to force implementation of inconvenient security procedures. Sadly, data breaches happen on a regular basis to small and large organizations alike. Here are some examples:

    • TJX Companies. In December 2006, TJX Companies, Inc., failed to protect its IT systems with a proper firewall. A group led by high-profile hacker Albert Gonzalez gained access to more than 90 million credit cards. Gonzalez was convicted of the crime and sentenced to 20 years in prison. Eleven other people were arrested in connection with the breach.
    • Department of Veterans Affairs. A database containing names, dates of birth, types of disability and Social Security numbers of more than 26 million veterans was stolen from an unencrypted database at the Department of Veterans Affairs. Leaders in the organization estimated that it would cost between $100 million and $500 million to cover damages resulting from the theft. This is an excellent example of human error being the softest point in the security profile. An external hard drive and laptop were stolen from the home of an analyst who worked at the department. Although the theft was reported to local police promptly, the head of the department was not notified until two weeks later. He informed federal authorities right away, but the department did not make any public statement until several days had gone by. Incredibly, an unidentified person returned the stolen data in late June 2006.
    • Sony PlayStation Network. In April 2011, more than 75 million PlayStation Network accounts were compromised. The popular service was down for weeks, and industry experts estimate the company lost millions of dollars. It is still considered by many to be the worst breach of a multiplayer gaming network in history. To this day, the company says it has not determined who the attackers were. The hackers were able to get gamers’ names, email addresses, passwords, buying histories, home addresses and credit card numbers. Because Sony is a technology company, the breach was all the more surprising and concerning. Consumers began to wonder: If it could happen to Sony, was their data safe at other big companies?
    • Gawker Media. Hackers breached Gawker Media, parent company of the popular gossip site Gawker.com, in December 2010. The passwords and email addresses of more than one million users of Gawker Media properties like Gawker, Gizmodo, and Lifehacker were compromised. The company made basic security mistakes, including storing passwords in a format hackers could easily crack; a safer approach is sketched below.
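    For contrast with the Gawker mistake, here is a minimal sketch of safer password storage using only Python’s standard library. The iteration count is illustrative, and dedicated schemes such as bcrypt or Argon2 are also common choices:

        import hashlib
        import hmac
        import os

        ITERATIONS = 600_000  # illustrative; tune to your hardware

        def hash_password(password: str) -> tuple[bytes, bytes]:
            """Return (salt, key) from a deliberately slow key-derivation function."""
            salt = os.urandom(16)  # unique random salt per password
            key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
            return salt, key

        def verify_password(password: str, salt: bytes, key: bytes) -> bool:
            candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
            return hmac.compare_digest(candidate, key)

        salt, key = hash_password("correct horse battery staple")
        assert verify_password("correct horse battery staple", salt, key)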

    Take These Steps

    In summary, basic database security is not especially difficult but requires constant vigilance and consistent effort. Here is a snapshot review:

    • Secure the physical environment.
    • Strengthen network security.
    • Limit access to the server.
    • Cut back or eliminate unneeded features.
    • Apply patches and updates immediately.
    • Encrypt sensitive data such as credit cards, bank statements, and passwords.
    • Document baseline configurations, and ensure all database administrators follow the policies.
    • Encrypt all communications between the database and applications, especially Web-based programs.
    • Match internal patch cycles to vendor release patterns.
    • Make consistent backups of critical data, and protect the backup files with database encryption.
    • Create an action plan to implement if data is lost or stolen. In the current computing environment, it is better to think in terms of when this could happen, not if it will happen.

    Basic database security seems logical and obvious. However, the repeated occurrences of major and minor data breaches in organizations of all sizes indicate that company leadership, IT personnel, and database administrators are not doing all they can to implement consistent database security principles.
    The cost of doing otherwise is too great. Increasingly, corporate America is turning to cloud-based enterprise software. Many of today’s popular services, like Facebook, Google and Amazon, rely on advanced databases and high-level computer languages to handle millions of customers accessing their information at the same time. In our next article, we take a closer look at advanced database security methods that these companies and other forward-thinking organizations use to protect their data and prevent hackers, crackers, and thieves from making off with millions of dollars worth of information.

    Source: Sys-con Media

  • Database possibilities in an era of big data


    We live in an era of big data. The sheer volume of data already in existence is challenge enough, without also grappling with the amount of new information generated every day. Think about it: financial transactions, social media posts, web traffic, IoT sensor data, and much more, ceaselessly pulled into databases the world over. Outdated technology simply can’t keep up.

    The modern types of databases that have arisen to tackle the challenges of big data take a variety of forms, each suited for different kinds of data and tasks. Whatever your company does, choosing the right database to build your product or service on top of is a vital decision. In this article, we’ll dig into the different types of database options you could be considering for your unique challenges, as well as the underlying database technologies you should be familiar with. We’ll be focusing on relational database management systems (RDBMS), NoSQL DBMS, columnar stores, and cloud solutions.

    RDBMS

    First up, the reliable relational database management system. This widespread variety is renowned for its focus on the core database attributes of atomicity (tasks are indivisible and irreducible), consistency (actions taken by the database obey certain constraints), isolation (a transaction’s intermediate state is invisible to other transactions), and durability (data changes reliably persist). Data in an RDBMS is stored in tables, and an RDBMS can tackle large volumes of data and complex queries, as opposed to flat files, which tend to take up more memory and are less efficient. An RDBMS is usually made up of a collection of tables, each with columns (fields) and records (rows). Popular examples of RDBMSs include Microsoft SQL Server, Oracle, MySQL, and PostgreSQL.
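    A minimal sketch of the table/column/row model, using Python’s built-in sqlite3 module purely for illustration (the table and data are invented):

        import sqlite3

        conn = sqlite3.connect(":memory:")  # throwaway in-memory database
        conn.execute("""
            CREATE TABLE customers (
                id    INTEGER PRIMARY KEY,   -- each row is one record
                name  TEXT NOT NULL,         -- each column is one field
                email TEXT UNIQUE
            )
        """)
        conn.execute("INSERT INTO customers (name, email) VALUES (?, ?)",
                     ("Ada Lovelace", "ada@example.com"))
        conn.commit()  # durability: the change persists once committed

        for row in conn.execute("SELECT id, name FROM customers"):
            print(row)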

    Some of the strengths of an RDBMS include flexibility and scalability. Given the huge amounts of information that modern businesses need to handle, these are important factors to consider when surveying different types of databases. Ease of management is another strength since each of the constituent tables can be changed without impacting the others. Additionally, administrators can choose to share different tables with certain users and not others (ideal if working with confidential information you might not want shared with all users). It’s easy to update data and expand your database, and since each piece of data is stored at a single point, it’s easy to keep your system free from errors as well.

    No system is perfect, however. An RDBMS is typically built on a single server, so once you hit the limits of the machine you’ve got, you need to buy a bigger one. Rapidly changing data can also challenge these systems, as increased volume, variety, velocity, and complexity create complicated relationships that the RDBMS can have trouble keeping up with. Lastly, despite having 'relation' in the name, relational database management systems don’t store the relationships between elements, meaning that the system doesn’t actually understand the connections between data that your joins rely on.

    NoSQL DBMS

    NoSQL (originally 'non-relational' or 'not SQL') DBMSs emerged as web applications became more complex. These types of databases are designed to handle heterogeneous data that’s difficult to fit into a normalized schema. While they can take a wide array of forms, the most important difference between NoSQL and an RDBMS is that while relational databases rigidly define how all the data they contain must be arranged, NoSQL databases can be schema agnostic. This means that if you’ve got unstructured and semi-structured data, you can store and manipulate it easily, whereas an RDBMS might not be able to handle it at all.
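    To illustrate schema agnosticism, here is a sketch using the third-party pymongo driver (an assumption; the server address, database, collection, and field names are all invented). Two documents of different shapes coexist in one collection:

        from pymongo import MongoClient  # pip install pymongo

        client = MongoClient("mongodb://localhost:27017")  # hypothetical server
        events = client.analytics.events  # invented database and collection names

        # Two documents with different fields live side by side:
        events.insert_one({"type": "click", "page": "/pricing", "user_id": 42})
        events.insert_one({"type": "sensor", "temperature_c": 21.7,
                           "tags": ["warehouse", "iot"]})

        for doc in events.find({"type": "click"}):
            print(doc)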

    Considering this, it’s no wonder that NoSQL databases are seeing a lot of use in big data and real-time web apps. Examples of these database technologies include MongoDB, Riak, Amazon S3, Cassandra, and HBase. One drawback of NoSQL databases is 'eventual consistency': all nodes will eventually hold the same data, but since there’s a lag while the nodes update, it’s possible to get out-of-sync data depending on which node you end up querying during the update window. Data consistency is a challenge with NoSQL because these systems do not perform ACID transactions.

    Columnar storage database

    A columnar storage database’s defining characteristic is that it stores data tables by column rather than by row. The main benefit of this configuration is that it accelerates analyses, because the system only has to read the locations your query is interested in, all within a single column. These systems also achieve better compression: since the data in one specific column is homogeneous (the values are all the same type: integers, strings, etc.), repeating values can be compressed very effectively.
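    A toy sketch of the difference in plain Python (data invented): summing one field must visit every whole record in a row layout, but only one contiguous array in a column layout:

        # Row-oriented: each record stored together
        rows = [
            {"id": 1, "region": "EU", "amount": 120},
            {"id": 2, "region": "US", "amount": 75},
            {"id": 3, "region": "EU", "amount": 200},
        ]
        total = sum(r["amount"] for r in rows)  # touches every whole record

        # Column-oriented: each field stored together
        columns = {
            "id":     [1, 2, 3],
            "region": ["EU", "US", "EU"],   # homogeneous, compresses well
            "amount": [120, 75, 200],
        }
        total = sum(columns["amount"])  # reads a single contiguous column

        print(total)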

    However, due to this design, columnar storage databases are not typically used to build transactional databases. One of the drawbacks of these types of databases is that inserts and updates on an entire row (necessary for apps like ERPs and CRMs, for example) can be expensive, and whole-row reads are slower as well. For example, when opening an account’s page in a CRM, the app needs to read the entire row (name, address, email, account id, etc.) to populate the page, and write all of it back as well. For this workload, a row-oriented relational database would be more efficient.

    Cloud solutions

    While not technically a type of database themselves, no discussion of modern types of database solutions would be complete without discussing the cloud. In this age of big data and fast-moving data sources, data engineers are increasingly turning to cloud solutions (AWS, Snowflake, etc.) to store, access, and analyze their data. One of the biggest advantages of cloud options is that you don’t have to pay for the physical space or the physical machine associated with your database (or its upkeep, emergency backups, etc.). Additionally, you only pay for what you use: as your memory and processing power needs scale up, you pay for the level of service you need, but you don’t have to pre-purchase these capabilities.

    There are some drawbacks to using a cloud solution, however. First off, since you’re connecting to a remote resource, bandwidth limitations can be a factor. Additionally, even though the cloud does offer cost savings, especially when starting a company from scratch, the lifetime costs of paying your server fees could exceed what you would have paid buying your own equipment. Lastly, depending on the type of data you’re dealing with, compliance and security can be issues, because responsibility for managing the data and its security no longer rests solely with you, the data owner, but is shared with the third-party provider. For example, unsecured APIs and interfaces that can be readily exploited, data breaches, elevated data loss or leakage risks, and unauthorized access through improperly configured firewalls are some of the ways in which cloud databases can be compromised.

    Decision time

    The era of Big Data is changing the way companies deal with their data. This means choosing new database models and finding the right analytics and BI tools to help your team get the most out of your data and build the apps, products, and services that will shape the world. Whatever you’re creating, pick the right database type for your needs, and build boldly.

    Author: Jack Cieslak

    Source: Sisense

  • Down to Business: Seven tips for better market intelligence

    Making decisions about product and service offerings can make or break your success as a business. Business owners, executives and product managers need good information and data to make the most informed product decisions.

    This critical information about markets, customers, competitors and technology is called market intelligence. Market intelligence combined with analysis provides market insight and allows better decision making.

    Here are seven tips for better market intelligence:

    1. Develop a process: Your ability to harness, manage and analyze good data is vital to your success. Ensure you develop a process for gathering, storing and utilizing market intelligence. Take the time to train your team and invest in a robust market intelligence process. It's an investment with an excellent return.

    2. Gather data when you lose: Often when a company loses an order we ask the salesperson what happened and they offer an opinion. It's important to drill down and really understand why you lost an important order. I recall a situation years ago where a salesperson's opinion was very different from what ultimately was the actual reason we lost this large order. Understanding the real reason for the loss assures you are far more likely to choose correct strategies to win the order in the future. Trust, but verify.

    3. Attend trade shows: You should attend trade shows and use them as a fact-finding mission. Trade shows are like one-stop shopping for market intelligence. There are industry analysts, suppliers, customers and industry media all in one location. Use your time wisely to engage with as many people as possible and utilize your listening skills. It's always best to plan ahead for trade shows, to make the best use of your limited time there. Make sure you stay at the hotel suggested by the show organizers. The "show hotel" may cost a little more than other hotels in the area, but you will have far more opportunities to gather information. You can also consider hiring someone, who does not work for your company, to gather information at trade shows, or speak with an industry analyst. This "stealth mode" of gathering market intelligence can provide added benefits.

    4. Take a customer to lunch: Understanding your customers, their challenges and their perception is one of the best ways to gain market insight. Ultimately it is your customer's perceptions that determine your brand positioning. Spending time with your customers, listening to them and acting on these insights, can provide you with an amazing competitive advantage.

    5. Build a database: Data can be hard to find as time moves forward and people leave an organization. It's worthwhile to build a central database of your market intelligence. Indexing this data makes it easy for your product managers and executives to access the best information when making decisions.

    6. Ensure you have good data: It takes good, accurate data to get the best results; never forget this point. Good data means better decisions. Accuracy can be improved by using multiple sources and considering how any specific source may be biased. Bad information leads to poor decisions. Ensure you are gathering good data.

    7. Train your team: You cannot gather good data that provides market intelligence unless you have a team of professionals that understands how to gain the best market insights. Ensure your team is trained not only in how to gather market intelligence, but also in how to analyze and use the data for better decision making. As an example, we offer a product management boot camp that covers this subject in detail, among others.

    Developing market intelligence takes work as well as a robust methodology. It's not a one-time event, but a continuous process. The absence of good data leads to suboptimal decisions. Good data leads to better decision-making and success for your organization.

  • Everything you need to know about a database management system and its uses


    Strong database management facilitates fast and effective business decision-making.

    Data drives everyday decision-making to help businesses complete tasks and accomplish their goals. Therefore, it requires proper management. But how do you effectively manage business data to ensure quick decision-making and smooth workflows? Using a database management system is the answer.

    A database management system makes it easier to store, organize, and share data across your business departments. It pulls data from the various tools, platforms, and applications your business uses and centralizes its storage so it can be easily searched and retrieved. It also eliminates risks such as data loss that delay or disrupt daily workflows.

    If you’re someone who works with data day in and day out or who relates to the everyday challenges of managing databases, this blog is for you. We explain what a database management system is and how you can use it to ensure data integrity and streamline data management processes.

    What is a database management system?

    A database management system is a software platform that helps you store and organize data. It creates a single centralized data source that can be used by stakeholders across departments. It combines the capabilities of data manipulation, analytics, and reporting to ensure better use of key data points.

    A database management system acts as an interface between your databases and employees. Employees can add, update, access, and delete data in the databases, based on the levels of permissions you assign to them. You can use database management software for:

    • Data management: Store, manage, categorize, and update business data.
    • Data retrieval: Find specific data points using the search functionality.
    • Queries: Run queries to perform specific actions such as calculations.
    • Data replication: Create duplicate instances of data and use them as a distributed database among employees.
    • Data security: Ensure data is secure from malicious attacks, unauthorized access, and accidental deletion.
    • Data conversion: Transfer data from one database to another—also known as data migration.

    Why do you need a database management system?

    For people like you who depend on data to get their jobs done, using a database management system has multiple benefits. It assists with structured data management to ensure easy access and sharing. It also frees you from time-consuming manual processing tasks such as finding a specific data point and sharing it with employees.

    In addition, database management software ensures business data is shared only with relevant internal or external stakeholders. This helps mitigate risks such as information loss or unauthorized access.

    Here are a few benefits of implementing a database system into your work processes:

    • Increases productivity due to fewer data-related errors
    • Speeds up decision-making with timely and uninterrupted access to data
    • Improves data sharing and security by allowing access to only authorized users

    Your business’s need for database management software depends on how your employees use data. For instance, some might use it for daily research (normal priority), while others might use it to develop software tools (high priority). Keep such usage scenarios in mind when deciding whether or not to use database management systems.

    Types of database management systems

    1. Relational database management system

    A relational database is a collection of data points that are related to each other so different data points can be combined for better usability. The relationship may be based on time, data, or logic, and can be categorized in the following ways (a small SQL sketch follows the list):

    • One-to-one: A data point in one table is related to a data point in another table.
    • One to many: A data point in one table is related to multiple data points in another table.
    • Many to one: Multiple data points in one table are related to a data point in another table.
    • Many to many: Multiple data points in one table are related to multiple data points in another table.
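    Here is a minimal sketch of a one-to-many relationship, using Python’s built-in sqlite3 (schema and data invented): one customer row relates to many order rows through a foreign key:

        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.executescript("""
            CREATE TABLE customers (
                id   INTEGER PRIMARY KEY,
                name TEXT NOT NULL
            );
            CREATE TABLE orders (
                id          INTEGER PRIMARY KEY,
                customer_id INTEGER REFERENCES customers(id),  -- the "many" side
                total       REAL
            );
            INSERT INTO customers VALUES (1, 'Acme Corp');
            INSERT INTO orders VALUES (10, 1, 99.50), (11, 1, 12.00);
        """)

        # One customer, many orders:
        for row in conn.execute("""
            SELECT c.name, o.id, o.total
            FROM customers c JOIN orders o ON o.customer_id = c.id
        """):
            print(row)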

    A relational database management system is software that manages the storage and shareability of relational databases. It organizes data in a relational database by forming functional dependencies between multiple data points. It also stores data in an organized manner so it’s easier for employees to find and use data for their daily tasks.

    A relational data structure uses structured query language (SQL) to allow employees to run queries and find the information they need. A relational database management system typically:

    • Stores large volumes of data
    • Enables fast data-fetching
    • Allows users to simultaneously access multiple data elements

    2. Object-oriented database management system

    An object-oriented database is a collection of data that is presented in the form of an object. Multiple data points are combined into a single unit or object, making it easier for employees to find and use data. This type of database is used to accomplish high-performance tasks, such as software development and programming, that require faster decision-making.

    An object-oriented database management system is software that stores and manages databases as objects. It allows employees to look for complete objects instead of individual data points, resulting in a quicker search. An object-oriented database structure typically:

    • Maintains a direct relationship between database objects and real-world scenarios so the objects don’t lose their purpose
    • Provides an object identifier for employees to quickly locate objects and use them
    • Handles different data types such as pictures, text, and graphics

    3. Hierarchical database management system

    A hierarchical database is a collection of data that is organized into a tree-like structure wherein the stored data is connected through links and arranged from top to bottom. The primary data point is at the top, and the secondary data points follow in hierarchy depending on their relevance. Your business’s organizational structure is a perfect example of a hierarchical database.

    A hierarchical database management system is software that stores and manages hierarchical databases. It maintains accuracy in data hierarchy or flow based on the usage in work processes. Data within a hierarchical system is typically:

    • Easy to add and delete
    • Easy to search and retrieve
    • Organized in a one-to-many relational data model

    4. Network database management system

    A network database is a collection of data where each data point is connected to multiple primary and secondary data points. Having interconnected data points makes this data model more flexible in terms of usage.

    A network database management system is software that stores and manages the interrelated data points in a network database. This model was built to overcome the shortcomings of the hierarchical database model, which doesn’t allow interconnection between data points beyond the top-to-bottom flow. A network database system typically:

    • Facilitates quick data access
    • Supports many-to-many relational database models
    • Allows users to create and manage complex database structures

    Who uses a database management system?

    In the list below, we share a couple of examples of professionals who use a database management system. Please note that these are just a few examples; there are many other professionals for whom data is a top priority in accomplishing tasks.

    • Application programmers: These professionals interact with databases to develop software apps and tools. They mostly use an object-oriented database management system to write code and then convert it into objects for better usability. Breaking a large codebase into smaller objects makes it less confusing for application programmers, especially when checking the performance of the developed applications.

    • Data analysts: These professionals collect raw business data and organize it into a database. They mostly use SQL in a relational database management system to identify raw data, draw valuable insights from it, and convert the insights into action points to impact business decision-making.

    DBMS software applications are also used in the following industry functions:

    • Railway reservation systems: A database management system is used to manage information such as ticket bookings, train timings, and arrival/departure status.
    • Library management: A database management system is used in libraries to manage the list of books. This includes keeping track of issuing dates, patron names, and author names.
    • Banking and finance: A database management system is used to manage the list of bank transactions, mode of payments, account details, and more.
    • Educational institutions: A database management system is used to manage the list of students, classes, lecture timings, and the number of hours logged in by both teachers and students.

    Use database management systems to enhance business decision-making

    Data is key to better decision-making, and efficient database management is key to getting data right. Therefore, it’s essential to manage your business data for effective usage, accessibility, and security.

    Author: Saumya Srivastava

    Source: Capterra

  • Top 6 Database Performance Metrics to Monitor in Enterprise Applications

    The previous article presented an introduction to SQL and NoSQL. This article builds on those topics by reviewing six of the top performance metrics to capture to assess the health of the database in your enterprise application. Specifically, this article reviews the following:

    • Business Transactions
    • Query Performance
    • User and Query Conflicts
    • Capacity
    • Configuration
    • NoSQL Databases

    Business Transactions

    Business Transactions provide insight into real user behavior: they capture real-time performance that real users are experiencing as they interact with your application. As mentioned in the previous article, measuring the performance of a business transaction involves capturing the response time of a business transaction holistically as well as measuring the response times of its constituent tiers. These response times can then be compared with the baseline that best meets your business needs to determine normalcy.
    If you were to measure only a single aspect of your application, I would encourage you to measure the behavior of your business transactions. While container metrics can provide a wealth of information and can help you determine when to auto-scale your environment, your business transactions determine the performance of your application. Instead of asking for the CPU usage of your application server you should be asking whether or not your users can complete their business transactions and if those business transactions are behaving optimally.


    As a little background, business transactions are identified by their entry-point, which is the interaction with your application that starts the business transaction.
    Once a business transaction is defined, its performance is measured across your entire application ecosystem. The performance of each business transaction is evaluated against its baseline to assess normalcy. For example, we might determine that a business transaction is behaving abnormally if its response time is more than two standard deviations slower than the average response time for this baseline, as shown in figure 1.


    Figure 1 Evaluating BT Response Time Against its Baseline
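    Here is a sketch of that two-standard-deviation test in plain Python (the sample response times are invented):

        import statistics

        # Hypothetical response times (ms) that form this hour's baseline
        baseline = [182, 175, 190, 168, 201, 188, 179, 195]

        mean = statistics.mean(baseline)
        stdev = statistics.stdev(baseline)
        threshold = mean + 2 * stdev  # anything slower is flagged as abnormal

        def is_abnormal(response_time_ms: float) -> bool:
            return response_time_ms > threshold

        print(f"baseline mean={mean:.1f}ms, abnormal above {threshold:.1f}ms")
        print(is_abnormal(240))  # True for a slow outlier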


    The baseline used to assess the business transaction is consistent for the hour in which the business transaction is running, but the baseline itself is refined by every business transaction execution. For example, if you have chosen a baseline that compares business transactions against the average response time for the hour of the day and the day of the week, then after the current hour is over, all business transactions executed in that hour will be incorporated into the baseline for next week. Through this mechanism an application can evolve over time without requiring the original baseline to be thrown away and rebuilt; you can consider it as a window moving over time.
    In summary, business transactions are the most reflective measurement of the user experience, so they are the most important metric to capture.


    Query Performance

    The most obvious place to look for poor query performance is in the query itself. Problems can result from queries that take too long to identify the required data or bring the data back. Look for these issues in queries:


    Selecting more data than needed
    It is not enough to write queries that return the appropriate rows; queries that return too many columns can cause slowness both in selecting the rows and retrieving the data. It is better to list the required columns rather than writing SELECT *. When the query selects specific fields, the plan may identify a covering index, which can speed up the results. A covering index includes all the fields used in the query, which means the database can generate the results from the index alone; it does not need to go to the underlying table to build the result. Additionally, listing only the columns required in the result reduces the data that’s transmitted, which also benefits performance.
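    A small sketch, using sqlite3 purely for illustration (schema invented): listing only the needed columns, plus a covering index, lets the engine answer the query from the index alone:

        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.executescript("""
            CREATE TABLE orders (
                id INTEGER PRIMARY KEY, customer_id INTEGER,
                status TEXT, total REAL, notes TEXT
            );
            -- Covering index: contains every column the query below touches
            CREATE INDEX idx_orders_cover ON orders (customer_id, status, total);
        """)

        # Lists only the needed columns instead of SELECT *
        plan = conn.execute("""
            EXPLAIN QUERY PLAN
            SELECT status, total FROM orders WHERE customer_id = ?
        """, (42,)).fetchall()
        print(plan)  # SQLite reports a COVERING INDEX for this query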


    Inefficient joins between tables
    Joins cause the database to bring multiple sets of data into memory and compare values, which can generate many database reads and significant CPU load. Depending on how the tables are indexed, the join may require scanning all the rows of both tables. A poorly written join on two large tables that requires a complete scan of each one is very computationally expensive. Other factors that slow down joins include joining on columns of different data types, which requires conversions, or a join condition that includes LIKE, which prevents the use of indexes. Avoid defaulting to a full outer join; use inner joins when appropriate to bring back only the desired data.


    Too few or too many indexes
    When there aren’t any indexes that the query optimizer can use, the database needs to resort to table scans to produce query results, which generates a large amount of disk input/output (I/O). Proper indexes also reduce the need for sorting results. Indexes on non-unique values do not provide as much help as unique indexes in generating results. If the keys are large, the indexes become large as well, and using them creates more disk I/O. Most indexes are intended to help the performance of data retrieval, but it is important to realize that indexes also impact the performance of data inserts and updates, as all associated indexes must be updated.


    Too much literal SQL causing parse contention
    Before any SQL query can be executed, it must be parsed, which checks syntax and permissions before generating the execution plan. Because parsing is expensive, databases save the SQL they’ve parsed to reuse it and eliminate the parsing time. Queries that use literal values cannot be shared, as the WHERE clauses differ. This results in each query being parsed and added to the shared pool. Because the pool has limited space, some saved queries are discarded to make room. If those queries recur, they need to be parsed again.
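    The usual remedy is bind variables: the statement text stays identical no matter the values, so the parsed plan can be shared and reused. A sketch in Python’s DB-API style, with sqlite3 and an invented table:

        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany("INSERT INTO users (name) VALUES (?)",
                         [("ann",), ("bob",), ("cho",)])

        # Literal SQL: every distinct id produces a distinct statement to parse
        # conn.execute(f"SELECT name FROM users WHERE id = {user_id}")  # avoid

        # Bind variable: one statement text, reused for every value
        for user_id in (1, 2, 3):
            row = conn.execute("SELECT name FROM users WHERE id = ?",
                               (user_id,)).fetchone()
            print(row)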


    User and Query Conflicts

    Databases are designed to be multi-user, but the activities of multiple users can cause conflicts.


    Page/row locking due to slow queries
    To ensure that queries produce accurate results, databases must lock tables to prevent inserts and updates from occurring while a read query is running. If a report or query is slow, users who need to modify database values may experience slowness and delays in completing their updates. Lock hints help the database use the least disruptive locks. Separating reporting from transactional databases is also an efficient solution.


    Transactional locks and deadlocks
    Deadlocks occur when two transactions are blocked because each one needs a resource held by the other. With a normal lock, a transaction is simply blocked until the resource is released; a deadlock, however, cannot resolve itself. Databases monitor for deadlocks and resolve them by choosing one of the blocked transactions to terminate and roll back, freeing the resource and allowing the other transaction to proceed.


    Batch activities causing resource contention for online users
    Batch processes typically perform bulk operations such as loading large amounts of data or generating complex analytical reports. These operations are resource-intensive and can impact performance for online users. The best solution for this issue is to ensure that batch operations are run when online usage is low, such as at night, or to use separate databases for transactional processing and analytical reporting.


    Capacity

    Not all database performance issues are database issues. Some problems result from running the database on inadequate hardware.


    Not enough CPUs or CPU speed too slow
    More CPUs can share the server workload, resulting in improved performance. The performance the database experiences is not solely a function of the database itself; it is also affected by other processes running on the server, so it is important to review the overall load as well as database usage. As CPU utilization varies throughout the day, metrics should be examined for periods of low usage, average usage, and peak usage to best assess whether additional CPU resources will be beneficial.


    Slow disk without enough IOPS
    Disk performance can be stated in terms of input/output operations per second (IOPS). Combined with the I/O size, this provides a measure of the disk’s throughput in megabytes per second. This throughput is also affected by the disk’s latency, which is how long it takes a request to complete. These metrics are unique to the technology of the disk storage. Traditional hard disk drives (HDD) have a rotating disk and are typically slower than solid state drives (SSD) or flash memory, which have no moving parts. Until recently, an SSD was more expensive than an HDD, but costs have come down, making it a competitive option.
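    The arithmetic is straightforward; a short calculation with invented figures:

        # Throughput (MB/s) = IOPS x I/O size; the figures below are hypothetical
        iops = 20_000        # operations per second the disk sustains
        io_size_kb = 8       # size of each operation, e.g. an 8 KB database page
        throughput_mb_s = iops * io_size_kb / 1024
        print(f"{throughput_mb_s:.0f} MB/s")  # about 156 MB/s for these numbers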


    Full or misconfigured disks
    Databases obviously require significant disk access, so incorrectly configured disks have a considerable performance impact. Disks should be suitably partitioned, with system data such as catalogs and logs separated from user data. Highly active tables should be separated to avoid contention. Increase parallelism by placing databases and indexes on different disks. Don’t place the operating system and swap space on the same disk as the database.


    Not enough memory
    Limited or poorly allocated physical memory impacts database performance. The more memory that is available, typically the better the performance will be. Monitor paging and swapping. Set up several page spaces on multiple, non-busy disks. Make sure the paging space allocated is sufficient for database requirements; each database vendor can provide guidance on this matter.


    Slow network
    Network speeds can affect how quickly retrieved data is returned to the end user or calling process. Use broadband for connecting to remote databases. In some cases, choosing TCP/IP instead of named pipes for the connection protocol can significantly increase performance.


    Configuration

    Every database has a large number of configuration settings. Default values may not be enough to give your database the performance it needs. Check all parameter settings, which includes looking for the following issues:


    Buffer cache too small
    Buffer cache improves performance by storing data in kernel memory and eliminating disk I/O. When the cache is too small, data is flushed from the cache more frequently. If it is needed again, it must be reread from disk. Besides the slowness of the disk read, this puts additional work on I/O devices and can become a bottleneck. In addition to allocating enough space to the buffer cache, tuning SQL queries can help them use buffer cache more efficiently.


    No query caching
    Query caching stores both database queries and their result sets. When an identical query is executed, the data is quickly retrieved from memory rather than requiring the query to be executed again. Updates to data invalidate the results, so query caching is only effective on static data. In some cases, a query cache can become a bottleneck rather than a benefit to performance. Huge caches can cause contention when they are locked for updates.


    I/O contention due to temporary table creation on disk
    Databases need to create temporary tables when performing certain query operations, such as executing a GROUP BY clause. When possible, the temporary tables are created in memory. However, in some cases, creating the temporary table in memory is not feasible, such as when the data contains BLOB or TEXT objects. In those cases, the temporary tables are created on disk. A large amount of disk I/O is required to create the temporary table, populate it with records, select the needed data from it, and drop the table when the query is complete. To avoid a potential performance impact, the temporary database should be separated from the main database space. Rewriting queries can also reduce the need for temporary tables by using derived tables instead. A derived table, which selects directly from the result of another SELECT statement, allows data to be joined in memory rather than on disk.
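    A sketch of the derived-table rewrite, using sqlite3 and an invented schema: the parenthesized SELECT is joined directly in place of an explicit temporary table:

        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.executescript("""
            CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL);
            INSERT INTO orders VALUES (1, 1, 50), (2, 1, 70), (3, 2, 20);
            CREATE TABLE customers (id INTEGER, name TEXT);
            INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
        """)

        # The inner SELECT is a derived table: its aggregate result is joined
        # directly, without materializing a separate temporary table on disk.
        query = """
            SELECT c.name, t.order_total
            FROM customers c
            JOIN (SELECT customer_id, SUM(total) AS order_total
                  FROM orders GROUP BY customer_id) AS t
              ON t.customer_id = c.id
        """
        for row in conn.execute(query):
            print(row)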


    NoSQL Databases

    NoSQL has much appeal because of its ability to handle large amounts of data very rapidly. However, some disadvantages should be assessed when weighing if NoSQL is right for your use-case scenario. This is why it is wise to consider that NoSQL stands for “Not Only SQL.” This clearer definition accepts the premise that NoSQL is not always the right solution, nor does it necessarily replace SQL across the board — here are five reasons why:


    Finicky Transactions
    It is hard to keep entries consistent with NoSQL. When accessing structured data, NoSQL does not always ensure that changes to various tables or records are made at the same time. If a process crashes, the data can become inconsistent. An example of transactions requiring consistency is double-entry accounting: a corresponding credit must balance every debit and vice versa. If the data on both sides is not consistent, the entry cannot be made. NoSQL may not “balance the books” properly; contrast this with the transactional sketch below.
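    For contrast, here is a sketch of how an ACID transaction keeps both sides of a double-entry posting consistent, using Python’s sqlite3 with invented accounts; either both updates commit or neither does:

        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance REAL)")
        conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                         [("cash", 1000.0), ("revenue", 0.0)])
        conn.commit()

        try:
            with conn:  # opens a transaction; commits on success, rolls back on error
                conn.execute("UPDATE accounts SET balance = balance - 100 "
                             "WHERE name = 'cash'")     # debit
                conn.execute("UPDATE accounts SET balance = balance + 100 "
                             "WHERE name = 'revenue'")  # matching credit
        except sqlite3.Error:
            pass  # on failure the rollback leaves both accounts untouched

        print(conn.execute("SELECT * FROM accounts").fetchall())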


    Complex Databases
    Supporters of NoSQL tend to point to the efficient code, simplicity, and speed of NoSQL. All of these factors line up when database tasks are simple. However, when databases become more complicated, NoSQL begins to break down. SQL has more potential over NoSQL when database demands are complicated because SQL has mature, industry standard interfaces. Each NoSQL setup has a unique interface.


    Consistent JOINS
    When executing a JOIN in SQL, there is a tremendous amount of overhead because the system must pull data from different tables and align it using keys. NoSQL seems like a dream because of its lack of JOINs: everything is kept together in one place, in the same table, and when data is retrieved, all of the key-value pairs are pulled at the same time. The problem is that this can create several copies of the same data. Those copies have to be updated, and NoSQL does not have the functionality to help in this situation.


    Flexibility in Schema Design
    NoSQL was unique when it emerged on the scene because it did not require a schema. In previous database models, programmers had to think about the columns needed to accommodate all of the potential data entries in each row. With NoSQL, entries can have a variety of strings or none at all. This flexibility allows programmers to ramp up applications quickly. However, it can be problematic when several groups are working on the same program, or when new teams of developers take over a project. After some developers have modified the database using the freedom of NoSQL, there may be a wide variety of key-value pair implementations.


    Resource Intensive
    NoSQL databases are commonly much more resource intensive than relational databases. They require much more CPU reserves and RAM allocation. For that reason, most shared hosting companies do not offer NoSQL. You must sign up for a VPS or run your own dedicated server. On the other hand, SQL is made to run on one server. This works out fine in the beginning, but as database demands increase, the hardware must expand as well. The problem is that a single server with huge capacity is much more expensive than a variety of smaller servers. The price increase is exponential. This provides one reason NoSQL has found a home in enterprise computing scenarios, such as those used by Google and Facebook.


    Conclusion

    This article presented a top-6 list of metrics that you might want to measure when assessing the health of your database. In summary, those top-6 items were:

    • Business Transactions
    • Query Performance
    • User and Query Conflicts
    • Capacity
    • Configuration
    • NoSQL Databases

    In the next article, we are going to pull all of the topics in this series together to present the approach that AppDynamics took to implementing its APM strategy. This is not a marketing article, but rather an explanation of why certain decisions and optimizations were made and how they can provide you with a robust view of the health of your applications and database.

    Source: Sys-Con
